Abstract

We show that measuring any two quantum states by a random POVM, under a suitable definition of 
randomness, gives probability distributions having total variation distance at least a universal constant 
times the Frobenius distance between the two states, with high probability. In fact, if the Frobenius distance
between the two states is not too small and their ranks are not too large, even a random orthonormal
basis works as above. Since a random POVM is independent of the two states, the above result gives 
us the first sufficient condition and an information-theoretic solution for the following quantum state 
distinction problem: given an a priori known ensemble of quantum states, is there a single measurement 
basis, or more generally a POVM, that gives reasonably large total variation distance between every 
pair of states from the ensemble? Large pairwise trace distance is a trivial necessary condition for the 
existence of a single distinguishing measurement for an ensemble; however, it is not sufficient, as seen 
for example by the recent work of Moore, Russell and Schulman [MRS05] on hidden subgroups of the
symmetric group. Our random POVM method gives us the first information-theoretic upper bound on 
the number of copies required to solve the quantum state identification problem for general ensembles, 
i.e., given some number of independent copies of a quantum state from an a priori known ensemble,
identify the state. Moreover, this upper bound is achieved by a single register algorithm, i.e., the algorithm
measures one copy of the state at a time, followed by a classical post-processing on the observed
outcomes in order to identify the state. 

The standard quantum approach to solving the hidden subgroup problem (HSP), which includes
Shor's algorithms for factoring and discrete logarithm, is a special case of the state identification problem 
where the ensemble consists of so-called coset states of candidate hidden subgroups. Combining Fourier 
sampling with our random POVM result gives us single register algorithms using polynomially many 
copies of the coset state that identify hidden subgroups having polynomially bounded rank in every 
representation of the ambient group. In particular, we get such single register algorithms when the 
hidden subgroup forms a Gel'fand pair, e.g. dihedral, affine and Heisenberg groups, with the ambient 
group, i. e., the rank in every representation is either zero or one. These HSP algorithms complement 
earlier results about the powerlessness of random Fourier sampling when the ranks are exponentially 
large, which happens for example in the HSP over the symmetric group. The drawback of random 
Fourier sampling based algorithms is that they are not efficient because measuring in a random basis is 
not. This leads us to the open question of efficiently implementable pseudo-random measurement bases. 



1 Introduction 

The hidden subgroup problem (HSP) is a central problem in quantum algorithms. Many important problems 
like factoring, discrete logarithm and graph isomorphism reduce to special cases of the HSP. Almost all 
exponential speedups that have been achieved in quantum computing are obtained by solving some instances
of the HSP. The HSP is defined as follows: given a function $f : G \to S$ from a group $G$ to a set $S$ that is
constant on left cosets of some subgroup $H \le G$ and distinct on different cosets, find a set of generators
for $H$. Ideally, we would like to find $H$ in time polynomial in the input size, i.e., $\log |G|$. Almost all
efficient quantum algorithms for solving special cases of the HSP, including Shor's algorithms for factoring
and discrete logarithm [Sho97], use the same generic approach, sometimes called the standard method. The
standard method for the HSP can be described as follows: evaluate the function $f$ in superposition and ignore
the function value to get a state of the form $\sigma_H := \frac{1}{|G|} \sum_{g \in G} |gH\rangle\langle gH|$, where $|gH\rangle := \frac{1}{\sqrt{|H|}} \sum_{h \in H} |gh\rangle$,
i.e., $\sigma_H$ is a uniform mixture of uniform superpositions over left cosets $gH$ of the hidden subgroup $H$. A
state of the form $\sigma_H$ for some subgroup $H \le G$ is called a coset state. The above procedure can be repeated
$t$ times to get $t$ independent copies of the state $\sigma_H$. The aim now is to identify $H$ from $\sigma_H^{\otimes t}$.

The coset state based approach to the HSP leads us to consider the following general problem called
quantum state identification: given $\sigma_i^{\otimes t}$ from an a priori known ensemble $\mathcal{E} = \{\sigma_1, \ldots, \sigma_m\}$ of quantum
states in $\mathbb{C}^n$, identify $i$. A related problem is the following quantum state distinction problem: is there a
single measurement basis, or more generally a POVM $\mathcal{M}$, that gives reasonably large total variation distance
between every pair of states in $\mathcal{E}$? The important point here is that we want a single measurement $\mathcal{M}$
that works well for every pair of states. A solution to the state identification problem trivially gives a
solution to the state distinction problem. It is not hard to see that the converse is also true: a POVM $\mathcal{M}$
with distinguishing power $\delta$, i.e., $\mathcal{M}$ solves the state distinction problem with total variation distance at
least $\delta$ between every pair of states from $\mathcal{E}$, gives an algorithm that identifies the given state with constant
probability from $t = O\left(\frac{\log m}{\delta^2}\right)$ independent copies. This algorithm is in fact a single register algorithm
in that it applies $t$ independent copies of $\mathcal{M}$ to the given $\sigma_i^{\otimes t}$ and does a classical 'minimum-finding style'
post-processing on the observed outcomes to guess $i$. Single register algorithms may have advantages over
multi-register algorithms in the interests of efficiency and ease of design; observe that the complexity of a
generic $k$-register measurement increases exponentially with $k$.

In this work, we study information-theoretic aspects of the general state distinction problem, and use it 
as a tool for solving the corresponding state identification problem. We also analyse various implications of 
these two problems, including consequences for the HSP. Our main objective is to find sufficient conditions 
on the ensemble $\mathcal{E}$ to guarantee the existence of a measurement with distinguishing power $\delta$. It is known
that two quantum states can be $\delta$-distinguished by a measurement if and only if they have trace distance at
least $\delta$. In general, this measurement depends upon the pair of states to be distinguished. Thus, this result
does not give us any way to come up with a single measurement $\mathcal{M}$ that works well for every pair of
states. However, it does provide a necessary condition: in order for a POVM with distinguishing power
$\delta$ to exist, every pair of states in $\mathcal{E}$ must have trace distance at least $\delta$. On a concrete note, we show that
the ensemble of coset states for subgroups of a group $G$ indeed has minimum pairwise trace distance of 1.
However, constant pairwise trace distance is not sufficient for the existence of a polynomially distinguishing
measurement, as seen for example by the recent work of Moore, Russell and Schulman [MRS05] on hidden
subgroups of the symmetric group. 

Random POVM and Frobenius distance: In this paper, we present for the first time a sufficient criterion
for the state distinction problem. Let $\|A\|_F$ denote the Frobenius norm of a matrix $A$, i.e., $\|A\|_F := \sqrt{\sum_{kl} |A_{kl}|^2}$.
For a POVM $\mathcal{M}$ and quantum state $\sigma$ in $\mathbb{C}^n$, let $\mathcal{M}(\sigma)$ denote the probability distribution
on the outcomes of $\mathcal{M}$ got by measuring $\sigma$ according to $\mathcal{M}$. Our main result can be stated informally as
follows.

Result 1 (Informal statement). Suppose $\sigma_1, \sigma_2$ are two quantum states in $\mathbb{C}^n$. Define $f := \|\sigma_1 - \sigma_2\|_F$. If
$\mathrm{rank}(\sigma_1) + \mathrm{rank}(\sigma_2)$ is not 'too large', then with probability at least $1 - \exp(-\Omega(\sqrt{n})) - \exp(-\Omega(f^2 n))$
over the choice of a random orthonormal basis $\mathcal{B}$ in $\mathbb{C}^n$, $\|\mathcal{B}(\sigma_1) - \mathcal{B}(\sigma_2)\|_1 > cf$, where $c$ is a universal
constant.
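As a small numerical illustration of Result 1 (not a proof), the following sketch samples a random orthonormal basis by Gram-Schmidt orthonormalising complex Gaussian vectors and compares the $\ell_1$ distance between the outcome distributions of two orthogonal pure states with their Frobenius distance $f = \sqrt{2}$. The dimension `n = 24`, the seed and the helper names `random_orthonormal_basis` and `l1_distance_pure` are assumptions made only for this example.

```python
import math
import random

def random_orthonormal_basis(n, rng):
    """Gram-Schmidt on n complex Gaussian vectors; returns n orthonormal vectors."""
    basis = []
    for _ in range(n):
        v = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(n)]
        for b in basis:
            # remove the component of v along the already constructed vector b
            c = sum(bj.conjugate() * vj for bj, vj in zip(b, v))
            v = [vj - c * bj for vj, bj in zip(v, b)]
        norm = math.sqrt(sum(abs(x) ** 2 for x in v))
        basis.append([x / norm for x in v])
    return basis

def l1_distance_pure(basis, psi1, psi2):
    """l1 distance between the outcome distributions of two pure states."""
    total = 0.0
    for b in basis:
        p1 = abs(sum(bj.conjugate() * x for bj, x in zip(b, psi1))) ** 2
        p2 = abs(sum(bj.conjugate() * x for bj, x in zip(b, psi2))) ** 2
        total += abs(p1 - p2)
    return total

rng = random.Random(0)
n = 24
e1 = [1.0 if j == 0 else 0.0 for j in range(n)]
e2 = [1.0 if j == 1 else 0.0 for j in range(n)]
basis = random_orthonormal_basis(n, rng)
f = math.sqrt(2.0)  # Frobenius distance between |e1><e1| and |e2><e2|
d = l1_distance_pure(basis, e1, e2)
print(d / f)  # ratio of observed l1 distance to the Frobenius distance
```

For these two states the ratio is typically a constant bounded away from zero, in line with the informal statement; the sketch says nothing about the constant $c$ or the rank condition.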

Using the above result, we can show that if the minimum pairwise Frobenius distance of an ensemble
$\mathcal{E} = \{\sigma_1, \ldots, \sigma_m\}$ of states in $\mathbb{C}^n$ is at least $f$, then with probability at least $1 - \exp(-n)$, a random POVM
$\mathcal{T}$, with an appropriate notion of randomness, gives total variation distance at least $cf$ between every pair
of states of $\mathcal{E}$, where $c > 0$ is a universal constant. The notion of random POVM that we use is as follows:
attach a zero ancilla in C m , where m := 6 ( nl ° g 2 2m ), and measure $\sigma_i \otimes |0\rangle\langle 0|$ according to a random
orthonormal basis in the tensor product of $\mathbb{C}^n$ with the ancilla space. In addition, as suggested by Result 1, if the maximum rank of a state in
$\mathcal{E}$ is not too large, then we do not need a POVM at all; a random orthonormal basis in $\mathbb{C}^n$ will work just
as well. We also construct examples of density matrices $\sigma_1, \sigma_2$ with $\|\sigma_1 - \sigma_2\|_{tr} = 2$, where with very
high probability the total variation distance given by a random POVM is at most $O(\|\sigma_1 - \sigma_2\|_F)$, unless
exponentially many ancilla qubits are used to define the random POVM.

Application to the HSP: Our random POVM method has information-theoretic implications about the
HSP in a general group $G$. It is easy to see that the ensemble of coset states for subgroups of $G$ is simultaneously
block diagonal in the Fourier basis for $G$, where a block is labelled by an irreducible representation
(irrep) of $G$ and a row index. This leads us to consider the so-called random Fourier method for the HSP:
apply the quantum Fourier transform over $G$ to the given coset state and observe the name of an irrep $\rho$
and a row index $i$, and then measure the resulting reduced state using a random POVM. Previously, a few
examples of HSPs were given where random Fourier sampling required exponentially many copies of the
coset state in order to identify the hidden subgroup with constant probability [GSVV04, MRRS04]. In these
examples, the ranks of the blocks of the coset state in the Fourier basis were exponentially large. Using
the fact that $\|A\|_F \ge \frac{\|A\|_{tr}}{\sqrt{\mathrm{rank}(A)}}$ for any matrix $A$, we prove a surprising positive counterpart to the above
negative results. We show that polynomially many iterations of the random Fourier method give enough
classical information to identify the hidden subgroup $H$ if the ranks of the coset state in each block in the
Fourier basis are polynomially bounded. In fact, we define a distance metric $r(H_1, H_2)$ between two subgroups
$H_1, H_2 \le G$ based on the Frobenius distance between the corresponding blocks of the coset states
$\sigma_{H_1}$ and $\sigma_{H_2}$ in the Fourier basis of $G$, and show that random Fourier sampling gives total variation distance
at least $\Omega(r(H_1, H_2))$ between $\sigma_{H_1}$ and $\sigma_{H_2}$ with exponentially high probability. If the ranks of the blocks
of $\sigma_{H_1}, \sigma_{H_2}$ are polynomially bounded, then $r(H_1, H_2)$ is at least inverse polynomially large. The previous work of
[RRS05] also proposed a distance function $r'(H_1, H_2)$, but it was difficult to estimate $r'(H_1, H_2)$ except for
very special cases. Also, the function $r'(H_1, H_2)$ is not powerful enough to even show that if the ranks of
the blocks of $\sigma_{H_1}, \sigma_{H_2}$ are at most one, polynomially many iterations of random Fourier sampling suffice
to identify the hidden subgroup with high probability. Our new result improves our understanding of the
power of single register Fourier sampling, and establishes that the random POVM method can often be a
powerful information-theoretic tool.

In particular, for the important special case when the hidden subgroup $H$ forms a Gel'fand pair with
the ambient group $G$, i.e., each block has rank either zero or one, $O(\log^3 |G|)$ iterations of random strong
Fourier sampling give enough classical information to identify the hidden subgroup $H$ with high probability.
For many concrete examples, e.g. affine group, Heisenberg group, the number of iterations of random Fourier
sampling can be brought down to $O(\log |G|)$ by a more careful analysis. Gel'fand pairs have been studied
extensively in group theory, and a lot of recent work [MR05] on the hidden subgroup problem has involved
Gel'fand pairs, e.g. dihedral group [EH00, BCD05b], affine group [MRRS04], Heisenberg group [RRS05,
BCD05a]. For the dihedral and affine groups, it is possible to give explicit efficient measurement bases for
the single register Fourier sampling procedure that identify the hidden subgroup with high probability using
polynomially many copies. Interestingly, for the Heisenberg group no such explicit basis for single register
Fourier sampling is known, though an explicit efficient entangled basis for two-register Fourier sampling
is known [BCD05a]. The only proof that polynomially many iterations of single register Fourier sampling
suffice information-theoretically to identify hidden subgroups in the Heisenberg group is through random
Fourier sampling, and was first observed in [RRS05].

Since it can be shown that measuring in a Haar-random orthonormal basis is hard for a quantum computer,
the main open question that arises from our work is whether there are efficiently implementable
pseudo-random orthonormal bases for specific ensembles that have good distinguishing power. For example,
such a basis for the representations of groups $\mathbb{Z}_p^r \rtimes \mathbb{Z}_p$, $p$ prime, will give us algorithms for the HSP
in those groups having an efficient quantum part followed by a possibly superpolynomial classical post-processing.
For superconstant $r$, no such quantum algorithm is currently known. Current proposals of
pseudo-random orthonormal bases [EWS+03, ELL05], however, seem inadequate for our purposes.

Application to general state identification: Besides applications to the HSP, our random POVM method
also has some interesting consequences for the general state identification problem. For an ensemble $\mathcal{E}$ of
$m$ states in $\mathbb{C}^n$ with minimum pairwise trace distance $\delta$ and maximum rank $r$ of a state, $t = O\left(\frac{r \log m}{\delta^2}\right)$
independent copies of a state are enough to identify the state with high probability using $t$ iterations of a
random POVM. Since $r \le n$, for a general ensemble of quantum states we get $t = O\left(\frac{n \log m}{\delta^2}\right)$, which is the
first upper bound on the number of copies required for the general state identification problem to the best of
our knowledge. For pure states, we get $t = O\left(\frac{\log m}{\delta^2}\right)$, which is optimal up to constant factors. This result
for pure states can be independently proved by a detailed analysis of Gram-Schmidt orthonormalisation, but
the resulting measurement is a joint measurement entangled across $t$ registers. In contrast, note that all the
state identification algorithms arising from our random POVM result are single register algorithms.

Related work: The so-called pretty good measurement (PGM), also known as the square-root measurement, has
been proposed in the past as a measurement for the state identification problem [HW94]. Its performance
is indeed 'pretty good' if the ensemble of states possesses some special symmetries; see e.g. [EMV04]
and the references therein. The PGM approach has been recently applied to a few instances of the HSP
also [BCD05b, BCD05a, MR05], showing that it maximises the probability of identifying the hidden subgroup
for those instances. The PGM approach to state identification differs from our approach in an important
way: the PGM approach does not usually give single register algorithms for state identification, whereas
our approach based on state distinction does. This is because the PGM for $t$ copies, in general, is a joint
measurement and does not decompose as a tensor product of measurements on the individual copies. In
fact, for the dihedral HSP studied in [BCD05b], an exponential number of iterations of the PGM for a single
copy are required in order to identify a hidden reflection with constant probability. In contrast, polynomially
many iterations of 'forgetful' Fourier sampling on single copies give enough classical information to
identify a hidden reflection in the dihedral group [EH00].

Another problem similar to state distinction is as follows: for two a priori known ensembles $\mathcal{E}_1, \mathcal{E}_2$ of
quantum states, is there a two-outcome POVM that identifies with reasonable probability to which ensemble
a given state from $\mathcal{E}_1 \cup \mathcal{E}_2$ belongs? It turns out that the probability of error is related to the minimum trace
distance between the convex hulls of $\mathcal{E}_1$ and $\mathcal{E}_2$ [GW05, Jai05], and is 1/2 if the convex hulls intersect. In
contrast, in the state distinction problem we want to find a POVM with many outcomes that gives reasonable
total variation distance between every pair of states of the ensemble. Having more than two outcomes allows
us to find a pairwise distinguishing POVM even if the ensemble cannot be partitioned into two parts with
disjoint convex hulls.

Proof technique: In order to show that, under suitable conditions, a random orthonormal basis $\mathcal{B}$ gives
total variation distance at least $\Omega(\|\sigma_1 - \sigma_2\|_F)$ between two quantum states $\sigma_1, \sigma_2$, we have to analyse $\mathcal{B}$ in
the eigenbasis of $\sigma_1 - \sigma_2$. Our techniques differ from earlier work on the power of random bases for state
distinction [RRS05] in two ways. First, the paper [RRS05] could not handle an arbitrary pair of
quantum states $\sigma_1, \sigma_2$ because it used weaker symmetry arguments. Using better symmetry arguments and
a new probabilistic analysis of the Gram-Schmidt orthonormalisation process, we overcome this limitation
and reduce the problem to proving lower bounds on the tail of weighted sums of squares of Gaussian random
variables. Second, for the pairs of states considered in [RRS05], one only needed to prove tail lower bounds for
an unweighted sum of squares of Gaussians, i.e., one needed to prove tail lower bounds for the chi-square
distribution. The paper [RRS05] proved such bounds using the central limit theorem from probability theory.
However, since we are now in the weighted case, the statement of the central limit theorem does not quite
suffice. The main problem is that the central limit theorem cannot guarantee that a weighted sum of squared
Gaussians exceeds its mean by a standard deviation with constant probability independent of the number of
random variables and the weights. To do this, we have to use a powerful quantitative version of the central
limit theorem known as the Berry-Esséen theorem combined with 'weight smoothening' arguments. This
allows us to show that a weighted sum of squared Gaussians exceeds its mean by the $\ell_2$-norm of the weight
vector with constant probability. This is in contrast to Chernoff-like upper bounds on the tail of chi-square
distributions that are more commonly seen in the study of measure concentration for random unitaries. Since
the $\ell_2$-norm of the weight vector is closely related to $\|\sigma_1 - \sigma_2\|_F$, we get our main result easily after this.
The Berry-Esséen theorem also indicates that a random orthonormal basis cannot achieve total variation
distance much larger than $\|\sigma_1 - \sigma_2\|_F$, and in fact, we give an example of states $\sigma_1, \sigma_2$ with trace distance
2 where a random basis cannot give total variation distance more than $O(\|\sigma_1 - \sigma_2\|_F)$ with high probability.

2 Preliminaries 

2.1 Measure concentration in $\mathbb{C}^n$

In this subsection, we prove some simple results about measure concentration phenomena in $\mathbb{C}^n$ for large $n$,
that will be useful in the proof of our main theorem.

By a Gaussian probability distribution $\mathcal{G}$, we mean the one-dimensional real Gaussian probability distribution
with mean 0 and variance 1, i.e., for $x \in \mathbb{R}$, the probability density of $\mathcal{G}$ at $x$ is $\frac{e^{-x^2/2}}{\sqrt{2\pi}}$. We use $\Phi(\cdot)$
to denote the cumulative distribution function of $\mathcal{G}$, i.e., $\Phi(x)$ is the probability that $\mathcal{G}$ picks a real number
less than or equal to $x$.

The following tail bound on the sum of squares of $n$ independent Gaussians, also known as the chi-square
distribution with $n$ degrees of freedom, can be proved Chernoff-style using the moment generating
function of the square of a Gaussian random variable.

Fact 1. Let $G_1, \ldots, G_n$ be independent random variables where each $G_i$ is distributed according to $\mathcal{G}$. Let
$Y := \sum_{i=1}^n G_i^2$. For all $\epsilon \ge 0$,

$$\Pr[Y > n(1 + \epsilon)] \le \left(\exp(-\epsilon/2) \cdot \sqrt{1 + \epsilon}\right)^n.$$

The same upper bound also holds for $\Pr[Y < n(1 + \epsilon)]$ when $-1 < \epsilon < 0$.
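The bound of Fact 1 can be compared against a simulation. The following is a minimal sketch; the parameters `n = 20`, `eps = 0.5`, the number of trials and the function names are illustrative assumptions, and a Monte Carlo estimate is of course only a sanity check.

```python
import math
import random

def chernoff_bound(n, eps):
    """Right-hand side of Fact 1: (exp(-eps/2) * sqrt(1 + eps))**n."""
    return (math.exp(-eps / 2.0) * math.sqrt(1.0 + eps)) ** n

def empirical_upper_tail(n, eps, trials, rng):
    """Fraction of trials in which a sum of n squared Gaussians exceeds n(1 + eps)."""
    hits = 0
    for _ in range(trials):
        y = sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(n))
        if y > n * (1.0 + eps):
            hits += 1
    return hits / trials

rng = random.Random(1)
n, eps = 20, 0.5
bound = chernoff_bound(n, eps)
estimate = empirical_upper_tail(n, eps, 5000, rng)
print(estimate, bound)  # the empirical tail should not exceed the bound
```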

Using Fact 1, we can prove the following lemma upper bounding the length of the projection of a random
unit vector onto a fixed subspace.

Lemma 1. Let $W$ be a $k$-dimensional subspace of $\mathbb{C}^n$, where $k \le n/4$. Let $v$ be a random unit vector in
$\mathbb{C}^n$. Let $\Pi_W$ denote the orthogonal projector from $\mathbb{C}^n$ to $W$. Suppose $4 \le t \le n/k$. Then,

$$\Pr\left[\|\Pi_W(v)\|^2 > t \cdot \frac{k}{n}\right] \le \exp(-\Omega(tk)).$$

Also, for any $0 < \epsilon < 1/2$,

$$\Pr\left[(1 - \epsilon)\frac{k}{n} \le \|\Pi_W(v)\|^2 \le (1 + \epsilon)\frac{k}{n}\right] \ge 1 - \exp(-\Omega(\epsilon^2 k)).$$

Proof. We can choose a random unit vector $v \in \mathbb{C}^n$ as follows: choose a random vector $\hat{v} \in \mathbb{C}^n$ by
choosing $2n$ independent real random variables $G_1, \ldots, G_{2n}$, where each $G_i$ is distributed according to $\mathcal{G}$,
and treating a complex number as a pair of real numbers. Now normalise $\hat{v}$ to get a random unit vector $v$;
note that $\|\hat{v}\| = 0$ with probability 0. By symmetry, we can assume that $W$ is spanned by the first $k$ standard
basis vectors in $\mathbb{C}^n$. Thus, $\|\Pi_W(v)\|^2 = \frac{\sum_{j=1}^{2k} G_j^2}{\sum_{j=1}^{2n} G_j^2}$. Using $\epsilon = -1/2$ in Fact 1, we get $\sum_{j=1}^{2n} G_j^2 > n$ with
probability at least $1 - \exp(-\Omega(n))$ over the choice of $\hat{v}$. Since $\exp(-\epsilon/2) \cdot \sqrt{1 + \epsilon} < \exp(-\epsilon/10)$ for
$\epsilon \ge 1$, using $\epsilon = t/4$ in Fact 1 we get $\sum_{j=1}^{2k} G_j^2 < 2k\left(1 + \frac{t}{4}\right)$ with probability at least $1 - \exp(-\Omega(tk))$ over
the choice of $\hat{v}$. Thus, with probability at least $1 - \exp(-\Omega(tk)) - \exp(-\Omega(n))$ over the choice of $\hat{v}$,
$\|\Pi_W(v)\|^2 < \frac{(4 + t)k}{2n} \le \frac{tk}{n}$. Since $tk \le n$, this completes the proof of the first part of the lemma.

The proof of the second part of the lemma is very similar, using the inequality $\left(\exp(-\epsilon/2) \cdot \sqrt{1 + \epsilon}\right)^2 <
\exp(-\epsilon^2/3)$ for $0 < \epsilon < 1/2$. □
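The sampling procedure used in the proof of Lemma 1 is easy to simulate. The sketch below, with assumed parameters `n = 200`, `k = 20` and a hypothetical helper `projection_sq_norm`, draws $2n$ real Gaussians, reads off $\|\Pi_W(v)\|^2$ as the ratio of the first $2k$ squares to all $2n$ squares, and checks that the values concentrate around $k/n$.

```python
import random

def projection_sq_norm(n, k, rng):
    """Squared length of the projection of a random unit vector in C^n onto the
    span of the first k standard basis vectors, via 2n real Gaussians."""
    g = [rng.gauss(0.0, 1.0) ** 2 for _ in range(2 * n)]
    return sum(g[: 2 * k]) / sum(g)

rng = random.Random(2)
n, k, trials = 200, 20, 2000
samples = [projection_sq_norm(n, k, rng) for _ in range(trials)]
mean = sum(samples) / trials
# fraction of samples exceeding t * k / n for t = 4 (first part of the lemma)
frac_large = sum(s > 4 * k / n for s in samples) / trials
print(mean, k / n, frac_large)
```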

We now prove a lemma upper bounding the perturbation induced by the Gram-Schmidt orthonormalisation
process on $r$ random independent unit vectors in $\mathbb{C}^n$.

Lemma 2. Let $b'_1, \ldots, b'_r$ be a sequence of random independent unit vectors in $\mathbb{C}^n$, where $r \le n$. Let
$b_1, \ldots, b_r$ be the corresponding sequence of unit vectors got by Gram-Schmidt orthonormalising $b'_1, \ldots, b'_r$.
Fix $M \ge 1$. Then with probability at least $1 - r \cdot \exp(-\Omega(Mr))$ over the choice of $b'_1, \ldots, b'_r$,

$$\left\| |b'_i\rangle\langle b'_i| - |b_i\rangle\langle b_i| \right\|_{tr} \le O\left(\sqrt{\frac{Mr}{n}}\right)$$

for all $1 \le i \le r$.

Proof. For $1 \le i \le r$, let $\Pi_i$ denote the orthogonal projector from $\mathbb{C}^n$ to the subspace spanned by
$b'_1, \ldots, b'_i$. For $1 \le i \le r - 1$, putting $t = \Theta(Mr/i)$ in the first part of Lemma 1, we see that with probability
at least $1 - r \exp(-\Omega(Mr))$ over the choice of $b'_1, \ldots, b'_r$, $\|\Pi_i(b'_{i+1})\|^2 \le O\left(\frac{Mr}{n}\right)$. Recall that
$b_{i+1} := \frac{b'_{i+1} - \Pi_i(b'_{i+1})}{\|b'_{i+1} - \Pi_i(b'_{i+1})\|}$. Hence,

$$\|b'_{i+1} - b_{i+1}\|^2 = \|\Pi_i(b'_{i+1})\|^2 + \left(1 - \|b'_{i+1} - \Pi_i(b'_{i+1})\|\right)^2 = \|\Pi_i(b'_{i+1})\|^2 + \left(1 - \sqrt{1 - \|\Pi_i(b'_{i+1})\|^2}\right)^2 = 2 - 2\sqrt{1 - \|\Pi_i(b'_{i+1})\|^2} \le O\left(\frac{Mr}{n}\right).$$

The lemma now follows from the fact that for two unit vectors $|\phi\rangle, |\psi\rangle$, $\left\| |\phi\rangle\langle\phi| - |\psi\rangle\langle\psi| \right\|_{tr} \le
2 \left\| |\phi\rangle - |\psi\rangle \right\|$. □
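The identity $\|b'_{i+1} - b_{i+1}\|^2 = 2 - 2\sqrt{1 - \|\Pi_i(b'_{i+1})\|^2}$ used in the proof can be checked numerically. The following sketch is only an illustration under assumed parameters (`n = 100`, `r = 5`, a fixed seed); the helper names are hypothetical.

```python
import math
import random

def random_unit_vector(n, rng):
    v = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(n)]
    norm = math.sqrt(sum(abs(x) ** 2 for x in v))
    return [x / norm for x in v]

def inner(u, v):
    return sum(a.conjugate() * b for a, b in zip(u, v))

rng = random.Random(3)
n, r = 100, 5
ortho = []   # orthonormalised vectors b_1, ..., b_i built so far
checks = []  # pairs (||b' - b||^2, 2 - 2*sqrt(1 - p)) with p = ||Pi_i(b')||^2
for _ in range(r):
    bp = random_unit_vector(n, rng)
    proj = [0j] * n
    for b in ortho:
        c = inner(b, bp)
        proj = [pj + c * bj for pj, bj in zip(proj, b)]
    p = sum(abs(x) ** 2 for x in proj)
    resid = [x - y for x, y in zip(bp, proj)]
    q = math.sqrt(sum(abs(x) ** 2 for x in resid))
    b_new = [x / q for x in resid]
    dist_sq = sum(abs(x - y) ** 2 for x, y in zip(bp, b_new))
    checks.append((dist_sq, 2.0 - 2.0 * math.sqrt(1.0 - p)))
    ortho.append(b_new)
print(checks)
```

Each pair in `checks` should agree up to floating-point error, and both entries stay small because the projection onto a low-dimensional subspace is short.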



We will require the following fact about the size of a $\delta$-net in $\mathbb{C}^n$. A $\delta$-net $\mathcal{N}$ is a finite set of unit
vectors in $\mathbb{C}^n$ with the property that for any unit vector $v \in \mathbb{C}^n$, there exists a unit vector $v' \in \mathcal{N}$ such that
$\|v - v'\| \le \delta$. The fact follows from the proof technique of [Mat02, Lemma 13.1.1, Chapter 13] and by
identifying $\mathbb{C}^n$ with $\mathbb{R}^{2n}$. Below, for $1 \le j \le n$, $e_j$ denotes the $j$th standard unit vector in $\mathbb{C}^n$, viz., the
$n$-tuple containing a 1 in the $j$th location and zeroes elsewhere.

Fact 2. Fix any $\delta \in (0, 1]$. Then, there is a $\delta$-net $\mathcal{N}$ in $\mathbb{C}^n$ containing the $n$ standard unit vectors $e_1, \ldots, e_n$,
such that $|\mathcal{N}| \le \left(\frac{4}{\delta}\right)^{2n}$.



Using Fact 2, we can prove the following lemma upper bounding the spectral norm of an $n \times n$ matrix
whose entries are independent random complex numbers with independent Gaussian real and imaginary
parts.

Lemma 3. Define a random $n \times n$ complex matrix $M$ by independently choosing each entry to be a complex
number whose real and imaginary parts are independently chosen according to the Gaussian distribution
$\mathcal{G}$. Then, with probability at least $1 - \exp(-\Omega(n \log n))$ over the choice of $M$, $\|M\| \le O(\sqrt{n \log n})$.

Proof. Let $\delta := 1/\sqrt{n}$. Let $\mathcal{N}$ be a $\delta$-net in $\mathbb{C}^n$ guaranteed by Fact 2. Fix any unit vector $v \in \mathbb{C}^n$.
By symmetry, the probability distribution of $\|Mv\|^2$ is the same as that of $\|Me_1\|^2$, i.e., the probability
distribution of $\|Mv\|^2$ is the same as that of the sum of squares of $2n$ independent Gaussians. Let $t :=
C \log n$, where $C$ is a sufficiently large constant whose value will become clear later. Since $\exp(-\epsilon/2) \cdot
\sqrt{1 + \epsilon} < \exp(-\epsilon/10)$ for $\epsilon \ge 1$, using $\epsilon = t$ in Fact 1 we get that $\|Mv'\|^2 < (t + 1)n$ for all $v' \in \mathcal{N}$ with
probability at least $1 - (4\sqrt{n})^{2n} \cdot \exp(-\Omega(Cn \log n)) \ge 1 - \exp(-\Omega(n \log n))$ over the choice of $M$.
Note that for any vector $w \in \mathbb{C}^n$, we have

$$\|Mw\|^2 = \sum_{i=1}^n \Big|\sum_{j=1}^n M_{ij} w_j\Big|^2 \le \|w\|^2 \sum_{i=1}^n \sum_{j=1}^n |M_{ij}|^2 = \|w\|^2 \sum_{j=1}^n \|Me_j\|^2 \le \|w\|^2 n^2 (t + 1).$$

The first inequality above follows from Cauchy-Schwarz; the second holds because each $e_j$ belongs to $\mathcal{N}$. Now fix any unit vector $v \in \mathbb{C}^n$. Let $v'$ be the closest
vector to $v$ from $\mathcal{N}$, where ties are broken arbitrarily. Thus, $\|v - v'\| \le \delta$. We have

$$\begin{aligned}
\|Mv\|^2 &= \langle v|M^\dagger M|v\rangle = \langle v' + (v - v')|M^\dagger M|v' + (v - v')\rangle \\
&= \|Mv'\|^2 + \langle v'|M^\dagger M|v - v'\rangle + \langle v - v'|M^\dagger M|v'\rangle + \|M(v - v')\|^2 \\
&\le \|Mv'\|^2 + 2\|Mv'\| \|M(v - v')\| + \|M(v - v')\|^2 \\
&\le (t + 1)n + 2\sqrt{(t + 1)n} \cdot \|v - v'\| \cdot n\sqrt{t + 1} + \|v - v'\|^2 n^2 (t + 1) \\
&\le (t + 1)n + 2n^{3/2}(t + 1)\delta + \delta^2 n^2 (t + 1) \le O(n \log n).
\end{aligned}$$

The first inequality above follows from Cauchy-Schwarz. The proof of the lemma is now complete. □

Finally, we will require the following Berry-Esséen theorem from probability theory, which is a quantitative
version of the central limit theorem [Fel71, Chapter XVI, Section 5, Theorem 2].

Fact 3 (Berry-Esséen theorem). Let $X_1, \ldots, X_n$ be independent random variables. Define $\mu_i := E[X_i]$,
$\sigma_i := (E[|X_i - \mu_i|^2])^{1/2}$, $\rho_i := (E[|X_i - \mu_i|^3])^{1/3}$. Define the quantities

$$\sigma^2 := \sum_{i=1}^n \sigma_i^2, \qquad \rho^3 := \sum_{i=1}^n \rho_i^3, \qquad X := \frac{1}{\sigma} \sum_{i=1}^n (X_i - \mu_i).$$

Then for all $x \in \mathbb{R}$,

$$|\Pr[X \le x] - \Phi(x)| \le \frac{6\rho^3}{\sigma^3}.$$

Remark: The constant 6 in the Berry-Esséen theorem can be improved; the current record is 0.7915 by
Shiganov [Shi86]. However, Proposition 1 below holds as long as the constant is finite and independent of
$n$ and the random variables $X_1, \ldots, X_n$.

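Using Fact 3, we prove the following proposition which will play a central role in the proof of our main
theorem.

Proposition 1. Let $G_1, \ldots, G_n$ be independent random variables where each $G_i$ is distributed according to
$\mathcal{G}$. Let $\lambda_1, \ldots, \lambda_n \in (0, 1]$. Define

$$t := \sum_{i=1}^n \lambda_i, \qquad f := \sqrt{\sum_{i=1}^n \lambda_i^2}, \qquad X := \sum_{i=1}^n \lambda_i G_i^2.$$

Suppose $t \le 1$. Then, there is a constant $c$ independent of $n$ and $\lambda_1, \ldots, \lambda_n$ such that

$$\Pr[X > t + f] > c \quad \text{and} \quad \Pr[X < t] > c.$$

Proof. Without loss of generality, $\lambda_1 \ge \cdots \ge \lambda_n$. Let $K_1$ be a sufficiently large constant, whose choice
will become clear later. Suppose $\lambda_1 \ge t/K_1$. Note that $0 < f \le t$. There is a constant $c_1$ depending on $K_1$
but independent of $n$ and $\lambda_1, \ldots, \lambda_n$ such that $\Pr[G_1^2 > 2K_1] > c_1$, which implies that

$$\Pr[X > t + f] \ge \Pr[\lambda_1 G_1^2 > 2t] = \Pr\left[G_1^2 > \frac{2t}{\lambda_1}\right] \ge \Pr[G_1^2 > 2K_1] > c_1.$$

Also, since $f \ge \lambda_1 \ge t/K_1$,

$$\begin{aligned}
t = E[X] &\ge t \cdot \Pr[t \le X \le t + f] + (t + f) \Pr[X > t + f] \\
&= t \cdot \Pr[X \ge t] + f \cdot \Pr[X > t + f] \\
&> t \cdot \Pr[X \ge t] + \frac{t}{K_1} \cdot c_1 \\
&= t \cdot (1 - \Pr[X < t]) + \frac{t c_1}{K_1},
\end{aligned}$$

which gives $\Pr[X < t] > \frac{c_1}{K_1}$.

Now, suppose $\lambda_1 < t/K_1$. Define independent random variables $X_i := \lambda_i G_i^2$. Let $\mu_i, \sigma_i, \rho_i$ be defined as in
Fact 3. Recall that $E[G_1^2] = 1$, $E[|G_1^2 - 1|^2] = 2$ and that the absolute third central moment of $G_1^2$ is finite,
say equal to $K_2$. Then,

$$\frac{6 \sum_{i=1}^n \rho_i^3}{\left(\sum_{i=1}^n \sigma_i^2\right)^{3/2}} = \frac{6 K_2 \sum_{i=1}^n \lambda_i^3}{\left(2 \sum_{i=1}^n \lambda_i^2\right)^{3/2}} \le \frac{6 K_2 \lambda_1}{2^{3/2} f} < \frac{3 K_2}{K_1}.$$

Since $\sum_i \mu_i = t$ and $\sum_i \sigma_i^2 = 2f^2$, the event $X > t + f$ is the event that the normalised sum $\frac{X - t}{\sqrt{2} f}$ exceeds $\frac{1}{\sqrt{2}}$. Taking $x = \frac{1}{\sqrt{2}}$ in Fact 3, we get

$$\Pr[X > t + f] \ge 1 - \Phi\left(\frac{1}{\sqrt{2}}\right) - \frac{3K_2}{K_1}.$$

Similarly, taking $x = 0$ in Fact 3, we get

$$\Pr[X \le t] \ge \Phi(0) - \frac{3K_2}{K_1} = \frac{1}{2} - \frac{3K_2}{K_1}.$$

Choosing $K_1$ to be a sufficiently large constant, we see that there exists a universal constant $c_2$ such that
$\Pr[X > t + f] > c_2$ and $\Pr[X < t] = \Pr[X \le t] > c_2$. Now letting $c := \min\left\{\frac{c_1}{K_1}, c_2\right\}$, we have that
$\Pr[X > t + f] > c$ and $\Pr[X < t] > c$ always. Observe that $c$ is a universal constant independent of $n$ and
$\lambda_1, \ldots, \lambda_n$. □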
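The two tail probabilities in Proposition 1 can be estimated by simulation for specific weight vectors. The sketch below is a hedged illustration only: the two weight vectors (`flat` and `skewed`), the number of trials and the function `tail_estimates` are assumptions chosen for the example, and the simulation does not determine the constant $c$.

```python
import math
import random

def tail_estimates(weights, trials, rng):
    """Estimate Pr[X > t + f] and Pr[X < t] for X = sum_i w_i * G_i^2,
    where t is the l1-norm and f the l2-norm of the weight vector."""
    t = sum(weights)
    f = math.sqrt(sum(w * w for w in weights))
    above = below = 0
    for _ in range(trials):
        x = sum(w * rng.gauss(0.0, 1.0) ** 2 for w in weights)
        above += x > t + f
        below += x < t
    return above / trials, below / trials

rng = random.Random(4)
flat = [1.0 / 50] * 50                        # equal weights, t = 1
skewed = [0.5 ** (i + 1) for i in range(20)]  # geometrically decaying weights, t < 1
results = {"flat": tail_estimates(flat, 4000, rng),
           "skewed": tail_estimates(skewed, 4000, rng)}
print(results)
```

Both weight vectors satisfy $t \le 1$, and in both cases the two estimated probabilities stay bounded away from zero, as the proposition predicts.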

2.2 Quantum state distinction versus identification 

In this subsection, we explore the connection between the problems of quantum state distinction and state 
identification. 

A quantum state in $\mathbb{C}^n$ is modelled by a density matrix $\sigma$, which is an $n \times n$ Hermitian, positive semidefinite
matrix with unit trace. A positive operator-valued measure, or POVM for short, is the most general
measurement on quantum states. See e.g. [NC00] for a good introduction to density matrices and POVMs.
A POVM $\mathcal{M}$ in $\mathbb{C}^n$ is a finite collection of positive operators $E_i$ on $\mathbb{C}^n$, called elements of $\mathcal{M}$, that satisfy
the completeness condition $\sum_i E_i = \mathbb{1}_n$. If the state of the quantum system is given by the density matrix
$\sigma$, then the probability $p_i$ to observe the outcome labelled $i$ is given by the Born rule $p_i = \mathrm{Tr}(\sigma E_i)$. We use
$\mathcal{M}(\sigma)$ to denote the probability distribution on the outcomes of $\mathcal{M}$ got by measuring $\sigma$ according to $\mathcal{M}$.
The trace norm of an $n \times n$ matrix $A$ is defined as $\|A\|_{tr} := \mathrm{Tr}\sqrt{A^\dagger A}$. The Frobenius norm of $A$ is defined
as $\|A\|_F := \sqrt{\mathrm{Tr}\, A^\dagger A}$, which is nothing but the $\ell_2$-norm of the long vector in $\mathbb{C}^{n^2}$ corresponding to $A$. The
following fact follows easily from the Cauchy-Schwarz inequality.

Fact 4. For any matrix $A$, $\|A\|_F \ge \frac{\|A\|_{tr}}{\sqrt{\mathrm{rank}(A)}}$.
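Fact 4 is a statement about singular values: the trace norm is their $\ell_1$-norm, the Frobenius norm is their $\ell_2$-norm, and the rank counts the non-zero ones. The following sketch checks the inequality on a few hand-picked spectra; representing a matrix only by its list of singular values (or eigenvalues, for a Hermitian difference of states) is a simplifying assumption of this example, as are the example spectra.

```python
import math

def trace_norm(spectrum):
    return sum(abs(s) for s in spectrum)

def frobenius_norm(spectrum):
    return math.sqrt(sum(s * s for s in spectrum))

def rank(spectrum, tol=1e-12):
    return sum(1 for s in spectrum if abs(s) > tol)

examples = [
    [1.0, -1.0, 0.0, 0.0],        # difference of two orthogonal pure states
    [0.25] * 4 + [-0.25] * 4,     # two maximally mixed states on disjoint supports
    [0.7, 0.3, -1.0],             # an unbalanced spectrum, where the bound is not tight
]
gaps = []
for spec in examples:
    lower = trace_norm(spec) / math.sqrt(rank(spec))
    gaps.append(frobenius_norm(spec) - lower)
print(gaps)
```

The first two spectra attain the bound with equality (all non-zero singular values have the same magnitude), which is the equality case of Cauchy-Schwarz.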

Suppose there is an a priori known ensemble $\mathcal{E} = \{\sigma_1, \ldots, \sigma_m\}$ of quantum states in $\mathbb{C}^n$. Given $t$ copies
of a state $\sigma_i$, a single register state identification algorithm $\mathcal{A}$ for the ensemble $\mathcal{E}$ consists of a sequence of
POVMs $\mathcal{T}_j$, $1 \le j \le t$, where $\mathcal{T}_j$ operates on the $j$th copy of $\sigma_i$. There is no bound on the number
of outcomes of $\mathcal{T}_j$. The choice of $\mathcal{T}_j$ may depend on the observed outcomes of $\mathcal{T}_1, \ldots, \mathcal{T}_{j-1}$. After $t$
observations, $\mathcal{A}$ does a classical post-processing and declares its guess for $i$. For all $i$, $1 \le i \le m$, we want
$\mathcal{A}$ to guess $i$ with probability at least 3/4.

Let $0 < \delta \le 2$. A POVM $\mathcal{M}$ for the state distinction problem with distinguishing power $\delta$ for the
ensemble $\mathcal{E}$ is a POVM with the property that $\|\mathcal{M}(\sigma_i) - \mathcal{M}(\sigma_j)\|_1 \ge \delta$ for all $1 \le i < j \le m$. It is easy to
see via the triangle inequality that if there exists a single register state identification POVM on $t$ copies, then
there exists a state distinction POVM with distinguishing power $\Omega(1/t)$. The following fact is a converse to
the above observation; a proof sketch is included for completeness.

Fact 5. Let $\mathcal{E} = \{\sigma_1, \ldots, \sigma_m\}$ be an a priori known ensemble of quantum states in $\mathbb{C}^n$. If there is a POVM
$\mathcal{M}$ for the state distinction problem with distinguishing power $\delta$ for the ensemble $\mathcal{E}$, then there is a single
register state identification algorithm $\mathcal{A}$ for ensemble $\mathcal{E}$ working on $t = O\left(\frac{\log m}{\delta^2}\right)$ copies.

Proof. Fix $1 \le i < j \le m$. Under the promise that the unknown state is either $\sigma_i$ or $\sigma_j$, applying $\mathcal{M}$
to each of $t$ copies of the unknown state followed by a maximum likelihood estimate identifies the correct
state with probability at least $1 - \frac{1}{4m}$, as can be seen by a standard Chernoff bound. Let $F_{ij}$ denote this
maximum likelihood routine. The identification algorithm $\mathcal{A}$ starts by applying $\mathcal{M}$ on each of $t$ copies
of the unknown state, which a priori can be any $\sigma_i \in \mathcal{E}$. After that, $\mathcal{A}$ does $m - 1$ iterations of a classical
minimum-finding style post-processing procedure comparing two possible states $\sigma_i, \sigma_j$ in an iteration, using
the classical routines $F_{ij}$ on the $t$ observed outcomes. Note that the same $t$ observed outcomes are reused by
the various routines $F_{ij}$; no fresh measurements are done. The success probability of the minimum-finding
style post-processing, and hence of algorithm $\mathcal{A}$, is at least $1 - \frac{m - 1}{4m} > 3/4$. □
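The classical post-processing in the proof of Fact 5 can be sketched as follows. This is a minimal simulation under stated assumptions: the outcome distributions in `dists` stand in for hypothetical $\mathcal{M}(\sigma_i)$ with strictly positive entries, the pairwise routine is implemented as a log-likelihood comparison, and `t = 200` and the seed are arbitrary choices for the example.

```python
import math
import random

def log_likelihood(dist, samples):
    return sum(math.log(dist[s]) for s in samples)

def identify(dists, samples):
    """Minimum-finding style tournament: keep a current candidate and compare it
    against each remaining candidate with a pairwise maximum likelihood test.
    The same observed samples are reused in every comparison."""
    best = 0
    for j in range(1, len(dists)):
        if log_likelihood(dists[j], samples) > log_likelihood(dists[best], samples):
            best = j
    return best

rng = random.Random(5)
# hypothetical outcome distributions M(sigma_i) on three outcomes
dists = [[0.6, 0.3, 0.1], [0.3, 0.6, 0.1], [0.1, 0.3, 0.6], [0.34, 0.33, 0.33]]
true_index = 2
t = 200
samples = rng.choices(range(3), weights=dists[true_index], k=t)
guess = identify(dists, samples)
print(guess, true_index)
```

The tournament performs $m - 1$ comparisons and never draws fresh samples, mirroring the single register structure of the algorithm $\mathcal{A}$.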

2.3 Hidden subgroup problem and quantum Fourier transform 

In this section, we explain the importance of the quantum Fourier transform as a means of attacking the hidden
subgroup problem. For a general introduction to representation theory of finite groups, see e.g. [Ser77].

We use the term irrep to denote an irreducible unitary representation of a finite group $G$ and denote by $\hat{G}$
a complete set of inequivalent irreps. For any unitary representation $\rho$ of $G$, let $\rho^*$ denote the representation
obtained by entry-wise conjugating the unitary matrices $\rho(g)$, where $g \in G$. Note that the definition of $\rho^*$
depends upon the choice of the basis used to concretely describe the matrices $\rho(g)$. If $\rho$ is an irrep of $G$ so
is $\rho^*$, but in general $\rho^*$ may be inequivalent to $\rho$. Let $V_\rho$ denote the vector space of $\rho$, define $d_\rho := \dim V_\rho$,
and notice that $V_\rho = V_{\rho^*}$. The group elements $|g\rangle$, where $g \in G$, form an orthonormal basis of $\mathbb{C}^{|G|}$. Since
$\sum_{\rho \in \hat{G}} d_\rho^2 = |G|$, we can consider another orthonormal basis called the Fourier basis of $\mathbb{C}^{|G|}$ indexed by
$|\rho, i, j\rangle$, where $\rho \in \hat{G}$ and $i, j$ run over the row and column indices of $\rho$. The quantum Fourier transform
over $G$, $\mathrm{QFT}_G$, is the following linear transformation:

$$|g\rangle \mapsto \sum_{\rho \in \hat{G}} \sqrt{\frac{d_\rho}{|G|}} \sum_{i,j=1}^{d_\rho} \rho_{ij}(g)\, |\rho, i, j\rangle.$$

It follows from Schur's orthogonality relations (see e.g. [Ser77, Chapter 2, Proposition 4, Corollary 3]) that
$\mathrm{QFT}_G$ is a unitary transformation in $\mathbb{C}^{|G|}$.
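For a concrete instance of the definition above, consider the cyclic group $\mathbb{Z}_n$, where every irrep is a one-dimensional character $\chi_k(g) = e^{2\pi i k g / n}$ and $\mathrm{QFT}_G$ reduces to the discrete Fourier transform. The sketch below (our own illustration, not part of the original text) builds that matrix and checks unitarity numerically.

```python
import cmath

def qft_cyclic(n):
    """QFT over Z_n: row k is the irrep (character) k, column g is the group element g."""
    return [[cmath.exp(2j * cmath.pi * k * g / n) / n ** 0.5 for g in range(n)]
            for k in range(n)]

def is_unitary(m, tol=1e-9):
    """Check that the rows of m are orthonormal."""
    n = len(m)
    for a in range(n):
        for b in range(n):
            inner = sum(m[a][g] * m[b][g].conjugate() for g in range(n))
            if abs(inner - (1 if a == b else 0)) > tol:
                return False
    return True
```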

For a subgroup $H \le G$ and $\rho \in \hat{G}$, define $\rho(H) := \frac{1}{|H|} \sum_{h \in H} \rho(h)$. It follows from Schur's lemma
(see e.g. [Ser77, Chapter 2, Proposition 4]) that $\rho(H)$ is an orthogonal projection to the subspace of $V_\rho$
consisting of vectors that are point-wise fixed by every $\rho(h)$, $h \in H$. Define $r_\rho(H) := \mathrm{rank}(\rho(H))$. Notice
that $r_\rho(H) = r_{\rho^*}(H)$. The standard method of attacking the HSP in $G$ using coset states [GSVV04]
starts by forming the uniform superposition $\frac{1}{\sqrt{|G|}} \sum_{g \in G} |g\rangle|0\rangle$. It then queries $f$ to get the superposition
$\frac{1}{\sqrt{|G|}} \sum_{g \in G} |g\rangle|f(g)\rangle$. Ignoring the second register, the reduced state on the first register becomes
the density matrix $\sigma_H = \frac{1}{|G|} \sum_{g \in G} |gH\rangle\langle gH|$, that is, the reduced state is a uniform mixture over all
left coset states of $H$ in $G$. It can be easily seen that applying $\mathrm{QFT}_G$ to $\sigma_H$ gives us the density matrix
$\frac{|H|}{|G|} \bigoplus_{\rho \in \hat{G}} \bigoplus_{i=1}^{d_\rho} |\rho, i\rangle\langle \rho, i| \otimes \rho^*(H)$, where $\rho^*(H)$ operates on the space of column indices of $\rho$. Since
the states $\sigma_H$ are simultaneously block diagonal in the Fourier basis for any $H \le G$, the elements of any
POVM $\mathcal{M}$ operating on these states can without loss of generality be assumed to have the same block structure.
From this it is clear that any distinguishing measurement without loss of generality first applies the
quantum Fourier transform $\mathrm{QFT}_G$ to $\sigma_H$, measures the name $\rho$ of an irrep, the index $i$ of a row, and then
measures the reduced state on the column space of $\rho$ using a POVM $\mathcal{M}_\rho$ in $\mathbb{C}^{d_\rho}$. This POVM $\mathcal{M}_\rho$ may
depend on $\rho$ but is independent of $i$.

The probability of observing an irrep $\rho$ in this quantum state is given by $\mathcal{P}_H(\rho) = \frac{d_\rho |H| r_\rho(H)}{|G|}$.
Conditioned on observing $\rho$ we obtain a uniform distribution $1/d_\rho$ on the row indices. The reduced state on
the space of column indices after having observed an irrep $\rho$ and a row index $i$ is then given by the state
$\rho^*(H)/r_\rho(H)$, and a basic task for a hidden subgroup finding algorithm is how to extract information about
$H$ from it. In this paper, we will investigate the case when $\mathcal{M}_\rho$ is a random POVM, for a suitable definition
of randomness, in $\mathbb{C}^{d_\rho}$. We shall call this procedure random Fourier sampling. Grigni, Schulman,
Vazirani and Vazirani [GSVV04] show that under certain conditions on $G$ and $H$, random Fourier sampling
gives exponentially small information about distinguishing $H$ from the identity subgroup. In this paper,
we prove a complementary information-theoretic result, viz. under different conditions on $G$, $(\log |G|)^{O(1)}$
random strong Fourier samplings do give enough information to reconstruct the hidden subgroup $H$ with
high probability.

In weak Fourier sampling, we only measure the name of an irrep and ignore the reduced state on the
column space. It can be shown [HRTS03] that for normal hidden subgroups $H$, no more information about
$H$ is contained in the reduced state. Thus, weak Fourier sampling is the optimal measurement to recover a
normal hidden subgroup from its coset state. In particular, Fourier sampling is the optimal measurement on
coset states for the abelian HSP.

Define a distance metric $w(H_1, H_2) := \|\mathcal{P}_{H_1} - \mathcal{P}_{H_2}\|_1 = \sum_{\rho \in \hat{G}} |\mathcal{P}_{H_1}(\rho) - \mathcal{P}_{H_2}(\rho)|$ between subgroups
$H_1, H_2 \le G$. Adapting an argument in [HRTS03], it can be shown that $w(H_1, H_2) \ge 1/2$ if the
normal cores of $H_1$ and $H_2$ are different [RRS05]. Recall that the normal core of a subgroup $H$ is the largest
normal subgroup of $G$ contained in $H$. Thus, the main challenge is to distinguish between hidden subgroups
$H_1, H_2$ from the same normal core family.
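To make the weak Fourier sampling distribution and the metric $w$ concrete, the sketch below specialises to the abelian group $\mathbb{Z}_n$ with subgroups $H = \langle d \rangle$ for $d$ dividing $n$. This specialisation is our own illustrative assumption: every irrep is a character $k$ with $d_\rho = 1$, and $r_k(H) = 1$ exactly when the character is trivial on $H$.

```python
def weak_fourier_distribution(n, d):
    """P_H(k) = d_rho * |H| * r_k(H) / |G| for G = Z_n, H = <d>, with d_rho = 1."""
    assert n % d == 0
    order_h = n // d
    # The character k is trivial on H = <d> iff k * d is a multiple of n.
    return [order_h * (1 if (k * d) % n == 0 else 0) / n for k in range(n)]

def w_distance(n, d1, d2):
    """The l1 distance w(H_1, H_2) between the two weak Fourier sampling distributions."""
    p1 = weak_fourier_distribution(n, d1)
    p2 = weak_fourier_distribution(n, d2)
    return sum(abs(a - b) for a, b in zip(p1, p2))
```

In an abelian group every subgroup is its own normal core, so distinct subgroups always have $w \ge 1/2$; for instance $n = 12$, $d_1 = 3$, $d_2 = 4$ gives $w = 3/2$.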

We next show that coset states corresponding to different hidden subgroups of a group have trace distance
at least 1.

Proposition 2. Let $H_1, H_2$ be different subgroups of a group $G$. Then, $\|\sigma_{H_1} - \sigma_{H_2}\|_{tr} \ge 1$.

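Proof. For a subgroup $H \le G$, we let $G/H$ denote a complete set of left coset representatives of $H$ in $G$.
Since for any $c_1 \in G/H_1$,

$$|c_1 H_1\rangle = \sqrt{\frac{|H_1 \cap H_2|}{|H_1|}} \sum_{\substack{c \in G/(H_1 \cap H_2) \\ cH_1 = c_1 H_1}} |c(H_1 \cap H_2)\rangle,$$

we get

$$\sigma_{H_1} = \frac{|H_1|}{|G|} \sum_{c_1 \in G/H_1} |c_1 H_1\rangle\langle c_1 H_1| = \frac{|H_1 \cap H_2|}{|G|} \sum_{\substack{c, c' \in G/(H_1 \cap H_2) \\ cH_1 = c'H_1}} |c(H_1 \cap H_2)\rangle\langle c'(H_1 \cap H_2)|.$$

A similar fact is true for $\sigma_{H_2}$. We now define

$$\tilde{\sigma}_{H_1} := \frac{|H_1 \cap H_2|}{|G|} \sum_{\substack{c, c' \in G/(H_1 \cap H_2) \\ cH_1 = c'H_1,\ c \ne c'}} |c(H_1 \cap H_2)\rangle\langle c'(H_1 \cap H_2)|.$$

We define $\tilde{\sigma}_{H_2}$ similarly. Note that $\tilde{\sigma}_{H_1}, \tilde{\sigma}_{H_2}$ are Hermitian and for any $c \in G/(H_1 \cap H_2)$,
$\langle c(H_1 \cap H_2)|\tilde{\sigma}_{H_1}|c(H_1 \cap H_2)\rangle = 0$ and $\langle c(H_1 \cap H_2)|\tilde{\sigma}_{H_2}|c(H_1 \cap H_2)\rangle = 0$.

We now observe that for any $c, c' \in G/(H_1 \cap H_2)$,

$$\left(\langle c(H_1 \cap H_2)|\sigma_{H_1}|c'(H_1 \cap H_2)\rangle \ne 0\right) \wedge \left(\langle c(H_1 \cap H_2)|\sigma_{H_2}|c'(H_1 \cap H_2)\rangle \ne 0\right) \implies c = c'.$$

This is because $cH_1 = c'H_1$ and $cH_2 = c'H_2$ implies that $c(H_1 \cap H_2) = c'(H_1 \cap H_2)$, i.e. $c = c'$. This
implies that for any $c, c' \in G/(H_1 \cap H_2)$,

$$\left(\langle c(H_1 \cap H_2)|\tilde{\sigma}_{H_1}|c'(H_1 \cap H_2)\rangle = 0\right) \vee \left(\langle c(H_1 \cap H_2)|\tilde{\sigma}_{H_2}|c'(H_1 \cap H_2)\rangle = 0\right).$$

Thus, $\tilde{\sigma}_{H_1} \tilde{\sigma}_{H_2} = \tilde{\sigma}_{H_2} \tilde{\sigma}_{H_1} = 0$. Also, it follows that $\sigma_{H_1} - \sigma_{H_2} = \tilde{\sigma}_{H_1} - \tilde{\sigma}_{H_2}$.
Without loss of generality, $H_1$ is not a subgroup of $H_2$. Now,

$$\|\sigma_{H_1} - \sigma_{H_2}\|_{tr} = \|\tilde{\sigma}_{H_1} - \tilde{\sigma}_{H_2}\|_{tr} = \mathrm{Tr}\sqrt{(\tilde{\sigma}_{H_1} - \tilde{\sigma}_{H_2})^2} = \mathrm{Tr}\sqrt{\tilde{\sigma}_{H_1}^2 + \tilde{\sigma}_{H_2}^2} \ge \mathrm{Tr}\sqrt{\tilde{\sigma}_{H_1}^2} = \|\tilde{\sigma}_{H_1}\|_{tr}.$$

The inequality follows from the fact that $\tilde{\sigma}_{H_1}^2, \tilde{\sigma}_{H_2}^2$ are positive semidefinite operators and the square-root
function is monotonically increasing for such operators. In order to evaluate $\|\tilde{\sigma}_{H_1}\|_{tr}$, notice that
$\tilde{\sigma}_{H_1} = \frac{|H_1 \cap H_2|}{|G|} \bigoplus_{c_1 \in G/H_1} M_{c_1}$, where for any $c_1 \in G/H_1$,

$$M_{c_1} := \sum_{\substack{c, c' \in G/(H_1 \cap H_2) \\ cH_1 = c'H_1 = c_1 H_1,\ c \ne c'}} |c(H_1 \cap H_2)\rangle\langle c'(H_1 \cap H_2)|.$$

Now observe that $M_{c_1}$ is of the form $J - I$, where $J, I$ are the $\frac{|H_1|}{|H_1 \cap H_2|} \times \frac{|H_1|}{|H_1 \cap H_2|}$ all ones and identity
matrices respectively. Hence, $\|M_{c_1}\|_{tr} = 2\left(\frac{|H_1|}{|H_1 \cap H_2|} - 1\right)$ for all $c_1 \in G/H_1$. Thus,

$$\|\tilde{\sigma}_{H_1}\|_{tr} = \frac{|H_1 \cap H_2|}{|G|} \cdot \frac{|G|}{|H_1|} \cdot 2\left(\frac{|H_1|}{|H_1 \cap H_2|} - 1\right) = \frac{2(|H_1| - |H_1 \cap H_2|)}{|H_1|} \ge 1.$$

The inequality follows from the fact that $H_1 \cap H_2$ is a proper subgroup of $H_1$, since $H_1$ is not a subgroup
of $H_2$. This completes the proof of the proposition. □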
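As a small sanity check of Proposition 2 (an illustrative example of our own, not taken from the original text), take $G = \mathbb{Z}_2 \times \mathbb{Z}_2$ with $H_1 = \{(0,0), (1,0)\}$ and $H_2 = \{(0,0), (0,1)\}$, and identify $|(x,y)\rangle$ with $|x\rangle \otimes |y\rangle$. Here $I$ and $X$ denote the $2 \times 2$ identity and bit-flip matrices.

```latex
\sigma_{H_1} = \tfrac{1}{4}\,(I + X) \otimes I, \qquad
\sigma_{H_2} = \tfrac{1}{4}\, I \otimes (I + X), \qquad
\sigma_{H_1} - \sigma_{H_2} = \tfrac{1}{4}\,(X \otimes I - I \otimes X).
% The eigenvalues of X \otimes I - I \otimes X are 2, -2, 0, 0, hence
\|\sigma_{H_1} - \sigma_{H_2}\|_{tr} = \tfrac{1}{4}\,(2 + 2) = 1
  = \frac{2\,(|H_1| - |H_1 \cap H_2|)}{|H_1|} = \frac{2\,(2 - 1)}{2}.
```

So the lower bound of the proposition is attained in this example.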



3 Random measurement bases and Frobenius distance 

In this section, we prove our main result showing that a random POVM, for a suitable definition of randomness,
distinguishes between two density matrices by at least their Frobenius distance with high probability.
We first prove an important technical lemma that quickly implies our main theorem.

Lemma 4. Let $\sigma_1, \sigma_2$ be two density matrices in $\mathbb{C}^n$. Define $f := \|\sigma_1 - \sigma_2\|_F$. Then:

1. If $\mathrm{rank}(\sigma_1) + \mathrm{rank}(\sigma_2) \le \sqrt{n}/K$, where $K$ is a sufficiently large universal constant, then with
probability at least $1 - \exp(-\Omega(\sqrt{n})) - \frac{\sqrt{n}}{K} \cdot \exp(-\Omega(f^2 n))$ over the choice of a random orthonormal
measurement basis $B$ in $\mathbb{C}^n$, $\|B(\sigma_1) - B(\sigma_2)\|_1 \ge \Omega(f)$;

2. Take a set $B$ of $n$ independent random vectors $B := \{b_1, \ldots, b_n\}$ in $\mathbb{C}^n$, where each $b_i$ is got by
choosing $n$ independent complex numbers whose real and imaginary parts are independently chosen
according to the Gaussian $\mathcal{G}$. Define $\ell := \left\|\sum_{i=1}^n b_i b_i^\dagger\right\|$ and $\nu := 1_{\mathbb{C}^n} - \frac{1}{\ell} \sum_{i=1}^n b_i b_i^\dagger$. Let $\mathcal{M}$ denote
the POVM on $\mathbb{C}^n$ consisting of the elements $\frac{b_i b_i^\dagger}{\ell}$ for $1 \le i \le n$, and the element $\nu$. Note that $\mathcal{M}$
can be implemented as an orthonormal measurement in $\mathbb{C}^n \otimes \mathbb{C}^2$. Then with probability at least
$1 - \exp(-\Omega(n))$ over the choice of $B$, $\|\mathcal{M}(\sigma_1) - \mathcal{M}(\sigma_2)\|_1 \ge \Omega\left(\frac{f}{\log n}\right)$.

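Proof. We start by proving the first part of the lemma. Define $t := \|\sigma_1 - \sigma_2\|_{tr}$. We have $\mathrm{rank}(\sigma_1 - \sigma_2) \le
\sqrt{n}/K$, where $K$ is a sufficiently large universal constant whose value will become clear later. Let $B :=
\{|b_1\rangle, \ldots, |b_n\rangle\}$ be a random orthonormal basis of $\mathbb{C}^n$. Let $B(\sigma_1), B(\sigma_2)$ denote the probability distributions
on $[n]$ got by measuring $\sigma_1, \sigma_2$ respectively according to $B$. Let $\lambda_1, \ldots, \lambda_k$ denote the positive eigenvalues,
and $-\mu_{k+1}, \ldots, -\mu_{k+l}$ the negative eigenvalues of $\sigma_1 - \sigma_2$. Note that $k + l = \mathrm{rank}(\sigma_1 - \sigma_2) \le \sqrt{n}/K$.
We assume that we work in the eigenbasis of $\sigma_1 - \sigma_2$. Hence, we can write

$$\sigma_1 - \sigma_2 = \sum_{i=1}^{k} \lambda_i |i\rangle\langle i| - \sum_{j=k+1}^{k+l} \mu_j |j\rangle\langle j|, \qquad \sum_{i=1}^{k} \lambda_i = \sum_{j=k+1}^{k+l} \mu_j = \frac{t}{2}, \qquad \sum_{i=1}^{k} \lambda_i^2 + \sum_{j=k+1}^{k+l} \mu_j^2 = f^2.$$

Without loss of generality, $\sum_{i=1}^{k} \lambda_i^2 \ge \sum_{j=k+1}^{k+l} \mu_j^2$, so that $\sum_{i=1}^{k} \lambda_i^2 \ge f^2/2$. Also, by the Cauchy-Schwarz
inequality, $t \le f\sqrt{k + l}$. Then,

$$\|B(\sigma_1) - B(\sigma_2)\|_1 = \sum_{t=1}^{n} \left|\langle b_t|\sigma_1|b_t\rangle - \langle b_t|\sigma_2|b_t\rangle\right| = \sum_{t=1}^{n} \left|\sum_{i=1}^{k} \lambda_i |\langle b_t|i\rangle|^2 - \sum_{j=k+1}^{k+l} \mu_j |\langle b_t|j\rangle|^2\right|.$$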

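Define the random $n \times n$ unitary matrix $B$ to be the matrix whose row vectors are $\langle b_1|, \ldots, \langle b_n|$. Then,

$$\|B(\sigma_1) - B(\sigma_2)\|_1 = \sum_{t=1}^{n} \left|\sum_{i=1}^{k} \lambda_i |B_{ti}|^2 - \sum_{j=k+1}^{k+l} \mu_j |B_{tj}|^2\right|.$$

Instead of generating the random unitary matrix $B$ row-wise, we can generate it column-wise. The advantage now
is that we only have to randomly generate the first $k + l$ orthonormal columns; the rest of the columns can be
assumed to be zero without loss of generality. That is, we generate an $n \times (k + l)$ matrix $B$ whose columns are
random orthonormal vectors $|b_1\rangle, \ldots, |b_{k+l}\rangle$ in $\mathbb{C}^n$. To generate the matrix $B$, we generate an $n \times (k + l)$
matrix $B'$ whose columns are random independent unit vectors $|b'_1\rangle, \ldots, |b'_{k+l}\rangle$ in $\mathbb{C}^n$ and apply
Gram-Schmidt orthonormalisation to get $|b_1\rangle, \ldots, |b_{k+l}\rangle$. By Lemma 2, with a suitable choice of its parameter, we
get $\left\| |b'_t\rangle\langle b'_t| - |b_t\rangle\langle b_t| \right\|_{tr} \le O\left(\frac{1}{K\sqrt{k+l}}\right)$ for all $1 \le t \le k + l$ with probability at least
$1 - \exp(-\Omega(\sqrt{n}/K))$ over the choice of $B'$. Let $B(\sigma_1) - B(\sigma_2)$ and $B'(\sigma_1) - B'(\sigma_2)$ denote the functions on $[n]$
defined by

$$(B(\sigma_1) - B(\sigma_2))(t) := \sum_{i=1}^{k} \lambda_i |\langle b_i|t\rangle|^2 - \sum_{j=k+1}^{k+l} \mu_j |\langle b_j|t\rangle|^2 = \sum_{i=1}^{k} \lambda_i |B_{ti}|^2 - \sum_{j=k+1}^{k+l} \mu_j |B_{tj}|^2,$$

$$(B'(\sigma_1) - B'(\sigma_2))(t) := \sum_{i=1}^{k} \lambda_i |\langle b'_i|t\rangle|^2 - \sum_{j=k+1}^{k+l} \mu_j |\langle b'_j|t\rangle|^2 = \sum_{i=1}^{k} \lambda_i |B'_{ti}|^2 - \sum_{j=k+1}^{k+l} \mu_j |B'_{tj}|^2$$

respectively, where $1 \le t \le n$. We now have

$$\begin{aligned}
\|B(\sigma_1) - B(\sigma_2)\|_1
&= \sum_{t=1}^{n} \left|\sum_{i=1}^{k} \lambda_i |\langle b_i|t\rangle|^2 - \sum_{j=k+1}^{k+l} \mu_j |\langle b_j|t\rangle|^2\right| \\
&\ge \sum_{t=1}^{n} \left|\sum_{i=1}^{k} \lambda_i |\langle b'_i|t\rangle|^2 - \sum_{j=k+1}^{k+l} \mu_j |\langle b'_j|t\rangle|^2\right|
 - \sum_{t=1}^{n} \left|\sum_{i=1}^{k} \lambda_i \left(|\langle b'_i|t\rangle|^2 - |\langle b_i|t\rangle|^2\right)\right|
 - \sum_{t=1}^{n} \left|\sum_{j=k+1}^{k+l} \mu_j \left(|\langle b'_j|t\rangle|^2 - |\langle b_j|t\rangle|^2\right)\right| \\
&\ge \|B'(\sigma_1) - B'(\sigma_2)\|_1
 - \sum_{i=1}^{k} \lambda_i \sum_{t=1}^{n} \left||\langle b'_i|t\rangle|^2 - |\langle b_i|t\rangle|^2\right|
 - \sum_{j=k+1}^{k+l} \mu_j \sum_{t=1}^{n} \left||\langle b'_j|t\rangle|^2 - |\langle b_j|t\rangle|^2\right| \\
&\ge \|B'(\sigma_1) - B'(\sigma_2)\|_1
 - \sum_{i=1}^{k} \lambda_i \left\| |b'_i\rangle\langle b'_i| - |b_i\rangle\langle b_i| \right\|_{tr}
 - \sum_{j=k+1}^{k+l} \mu_j \left\| |b'_j\rangle\langle b'_j| - |b_j\rangle\langle b_j| \right\|_{tr} \\
&\ge \|B'(\sigma_1) - B'(\sigma_2)\|_1
 - \sum_{i=1}^{k} \lambda_i \cdot O\left(\frac{1}{K\sqrt{k+l}}\right)
 - \sum_{j=k+1}^{k+l} \mu_j \cdot O\left(\frac{1}{K\sqrt{k+l}}\right) \\
&= \|B'(\sigma_1) - B'(\sigma_2)\|_1 - O\left(\frac{t}{K\sqrt{k+l}}\right)
 \ \ge\ \|B'(\sigma_1) - B'(\sigma_2)\|_1 - O\left(\frac{f}{K}\right)
\end{aligned}$$

with probability at least $1 - \exp(-\Omega(\sqrt{n}/K))$ over the choice of $B'$. The third inequality follows from the
fact that the trace distance between two quantum states is an upper bound on the total variation distance
between the probability distributions got by performing a measurement on the two states.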

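We generate $B'$ by first generating an $n \times (k + l)$ matrix $\tilde{B}$ whose entries are independent complex-valued
random variables whose real and imaginary parts are each independently distributed according to the
Gaussian $\mathcal{G}$, and then normalising each column of $\tilde{B}$ in order to get $B'$. Let $\tilde{b}_1, \ldots, \tilde{b}_{k+l}$ denote the columns
of $\tilde{B}$. Since $\exp(-\epsilon/2) \cdot \sqrt{1 + \epsilon} < \exp(-\epsilon^2/6)$ for $0 < \epsilon < 1/2$, using $\epsilon = f/10$ in Fact 1 we see that with
probability at least $1 - (k + l)\exp(-\Omega(f^2 n))$ over the choice of $\tilde{B}$, $\|\tilde{b}_i\|^2 \le 2n\left(1 + \frac{f}{10}\right)$ for $1 \le i \le k$
and $\|\tilde{b}_j\|^2 \ge 2n\left(1 - \frac{f}{10}\right)$ for $k + 1 \le j \le k + l$. Consider any fixed $t$, $1 \le t \le n$. By Proposition 1, with
probability at least $c^2$ over the choice of $\tilde{B}$, where $c$ is a universal constant,

$$\sum_{i=1}^{k} \lambda_i |\tilde{B}_{ti}|^2 \ge 2\sum_{i=1}^{k} \lambda_i + \sqrt{2\sum_{i=1}^{k} \lambda_i^2} \qquad \text{and} \qquad \sum_{j=k+1}^{k+l} \mu_j |\tilde{B}_{tj}|^2 \le 2\sum_{j=k+1}^{k+l} \mu_j.$$

Call the above event $E_t$. If $E_t$ occurs we have

$$\sum_{i=1}^{k} \lambda_i |\tilde{B}_{ti}|^2 \ge 2\sum_{i=1}^{k} \lambda_i + \sqrt{2\sum_{i=1}^{k} \lambda_i^2} \ge t + f \qquad \text{and} \qquad \sum_{j=k+1}^{k+l} \mu_j |\tilde{B}_{tj}|^2 \le 2\sum_{j=k+1}^{k+l} \mu_j = t.$$

Combining this with the above bounds on the column norms, and using $t \le 2$ and $f \le \sqrt{2}$ in the last step,

$$\sum_{i=1}^{k} \lambda_i |B'_{ti}|^2 - \sum_{j=k+1}^{k+l} \mu_j |B'_{tj}|^2 \ge \frac{t + f}{2n\left(1 + \frac{f}{10}\right)} - \frac{t}{2n\left(1 - \frac{f}{10}\right)} = \frac{f}{2n\left(1 + \frac{f}{10}\right)} - \frac{tf}{10n\left(1 - \frac{f^2}{100}\right)} \ge \frac{f}{6n}.$$

Since the events $E_t$ for different $t$ are independent, using a standard Chernoff bound, with probability
at least $1 - \exp(-\Omega(n))$ over the choice of $\tilde{B}$, at least $\frac{c^2 n}{2}$ different $t$ will satisfy the above inequality.
This means that with probability at least $1 - \exp(-\Omega(n)) - (k + l)\exp(-\Omega(f^2 n))$ over the choice of $\tilde{B}$,
$\|B'(\sigma_1) - B'(\sigma_2)\|_1 \ge \frac{c^2 f}{12}$. Thus, with probability at least $1 - \exp(-\Omega(n)) - (k + l)\exp(-\Omega(f^2 n)) -
\exp(-\Omega(\sqrt{n}/K)) \ge 1 - \exp(-\Omega(\sqrt{n}/K)) - \frac{\sqrt{n}}{K} \cdot \exp(-\Omega(f^2 n))$ over the choice of a random orthonormal
basis $B$ of $\mathbb{C}^n$, $\|B(\sigma_1) - B(\sigma_2)\|_1 \ge \frac{c^2 f}{12} - O(f/K)$.

Since $c$ is a universal constant, we can choose $K$ to be a sufficiently large universal constant, thus proving
the first part of the lemma.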

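We now proceed to the proof of the second part of the lemma. Let $\lambda_1, \ldots, \lambda_k$ be the positive eigenvalues
and $-\mu_{k+1}, \ldots, -\mu_n$ the non-positive eigenvalues of $\sigma_1 - \sigma_2$. By symmetry, we can assume that we are
working in the eigenbasis of $\sigma_1 - \sigma_2$, i.e., the eigenbasis of $\sigma_1 - \sigma_2$ is the computational basis. Define the
$n \times n$ matrix $B$ to be the matrix whose column vectors are $b_1, \ldots, b_n$. Suppose $v$ is a unit vector in $\mathbb{C}^n$.
Then,

$$\langle v| \sum_{i=1}^{n} b_i b_i^\dagger |v\rangle = \sum_{i=1}^{n} |v^\dagger b_i|^2 = \|v^\dagger B\|^2 = \|B^\dagger v\|^2.$$

Hence we have

$$\ell = \left\|\sum_{i=1}^{n} b_i b_i^\dagger\right\| = \max_{v} \langle v| \sum_{i=1}^{n} b_i b_i^\dagger |v\rangle = \|B^\dagger\|^2 = \|B\|^2,$$

where the maximum is taken over all unit vectors $v \in \mathbb{C}^n$. The second equality follows because $\sum_{i=1}^{n} b_i b_i^\dagger$
is a positive matrix. By Lemma 3, $\ell = \|B\|^2 \le O(n \log n)$ with probability at least $1 - \exp(-\Omega(n \log n))$
over the choice of $B$.

Now,

$$\begin{aligned}
\|\mathcal{M}(\sigma_1) - \mathcal{M}(\sigma_2)\|_1
&= \frac{1}{\ell} \sum_{t=1}^{n} \left|b_t^\dagger \sigma_1 b_t - b_t^\dagger \sigma_2 b_t\right| + \left|\mathrm{Tr}(\sigma_1 \nu) - \mathrm{Tr}(\sigma_2 \nu)\right| \\
&\ge \frac{1}{\ell} \sum_{t=1}^{n} \left|\sum_{i=1}^{k} \lambda_i |B_{it}|^2 - \sum_{j=k+1}^{n} \mu_j |B_{jt}|^2\right| \\
&\ge \Omega\left(\frac{1}{n \log n}\right) \cdot \sum_{t=1}^{n} \left|\sum_{i=1}^{k} \lambda_i |B_{it}|^2 - \sum_{j=k+1}^{n} \mu_j |B_{jt}|^2\right|.
\end{aligned}$$

By Proposition 1 and a standard Chernoff bound, we see that with probability at least $1 - \exp(-\Omega(n))$ over
the choice of $B$, for at least $\frac{c^2 n}{2}$ different $t$,

$$\left|\sum_{i=1}^{k} \lambda_i |B_{it}|^2 - \sum_{j=k+1}^{n} \mu_j |B_{jt}|^2\right| \ge 2\sum_{i=1}^{k} \lambda_i + \sqrt{2\sum_{i=1}^{k} \lambda_i^2} - 2\sum_{j=k+1}^{n} \mu_j \ge t + f - t = f.$$

Thus, with probability at least $1 - \exp(-\Omega(n)) - \exp(-\Omega(n \log n)) \ge 1 - \exp(-\Omega(n))$ over the choice
of $B$,

$$\|\mathcal{M}(\sigma_1) - \mathcal{M}(\sigma_2)\|_1 \ge \Omega\left(\frac{1}{n \log n}\right) \cdot \frac{c^2 n}{2} \cdot f = \Omega\left(\frac{f}{\log n}\right)$$

since $c$ is a universal constant. Since the POVM $\mathcal{M}$ can be refined to a POVM with $2n$ rank one elements,
$\mathcal{M}$ can be implemented as an orthonormal measurement in $\mathbb{C}^n \otimes \mathbb{C}^2$. This completes the proof of the second
part of the lemma. □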
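The first part of Lemma 4 can be explored numerically with the small sketch below. It is an illustration under simplifying assumptions of our own: the two states are taken to be simultaneously diagonal (so that $\langle b_t|\sigma|b_t\rangle$ is a weighted sum of squared amplitudes and the trace distance is an $\ell_1$ distance of the diagonals), the random orthonormal basis is produced by Gram-Schmidt on complex Gaussian vectors as in the proof, and the dimension is far too small for the asymptotic constants to mean anything.

```python
import math
import random

def random_orthonormal_basis(n, rng):
    """Modified Gram-Schmidt applied to i.i.d. complex Gaussian vectors."""
    basis = []
    for _ in range(n):
        v = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(n)]
        for b in basis:
            overlap = sum(x.conjugate() * y for x, y in zip(b, v))  # <b|v>
            v = [y - overlap * x for x, y in zip(b, v)]
        norm = math.sqrt(sum(abs(y) ** 2 for y in v))
        basis.append([y / norm for y in v])
    return basis

def outcome_distribution(diag_state, basis):
    """B(sigma) for sigma = diag(diag_state): the values <b_t|sigma|b_t> over t."""
    return [sum(p * abs(c) ** 2 for p, c in zip(diag_state, b)) for b in basis]

def l1_distance(p, q):
    return sum(abs(a - b) for a, b in zip(p, q))

n = 32
sigma1 = [0.5, 0.5] + [0.0] * (n - 2)            # diagonal of a rank-2 state
sigma2 = [0.0, 0.0, 0.5, 0.5] + [0.0] * (n - 4)  # orthogonal rank-2 state
basis = random_orthonormal_basis(n, random.Random(7))
p1 = outcome_distribution(sigma1, basis)
p2 = outcome_distribution(sigma2, basis)
frobenius = math.sqrt(sum((a - b) ** 2 for a, b in zip(sigma1, sigma2)))
trace = l1_distance(sigma1, sigma2)   # valid because both states are diagonal
tv = l1_distance(p1, p2)              # the quantity ||B(sigma1) - B(sigma2)||_1
```

Comparing `tv` with `frobenius` over several seeds gives a feel for the statement; `tv` can never exceed `trace`.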



We are now finally in a position to prove the main theorem of the paper.

Theorem 1. Let $\sigma_1, \sigma_2$ be two density matrices in $\mathbb{C}^n$. Define $f := \|\sigma_1 - \sigma_2\|_F$. Then:

1. Let $K \ge 1$ be a sufficiently large quantity. Consider an ancilla space $\mathbb{C}^m$ initialised to zero, where
$m \ge \frac{4nK^2}{f^2}$. Let $B$ be a random orthonormal measurement basis in $\mathbb{C}^n \otimes \mathbb{C}^m$. Let $\mathcal{M}$ denote the
POVM on $\mathbb{C}^n$ got by attaching ancilla $|0\rangle$ to a state in $\mathbb{C}^n$ and applying the orthonormal measurement
$B$ in $\mathbb{C}^n \otimes \mathbb{C}^m$. Then with probability at least $1 - \exp(-\Omega(Kn))$ over the choice of $B$,
$\|\mathcal{M}(\sigma_1) - \mathcal{M}(\sigma_2)\|_1 \ge \Omega(f)$;

2. Let $K \ge 1$ and define $m := Kn$. Take a set $B$ of $m$ independent random vectors $B := \{b_1, \ldots, b_m\}$
in $\mathbb{C}^n \otimes \mathbb{C}^K$, where each $b_i$ is got by choosing $m$ independent complex numbers whose real and
imaginary parts are independently chosen according to the Gaussian $\mathcal{G}$. Define $\ell := \left\|\sum_{i=1}^{m} b_i b_i^\dagger\right\|$
and $\nu := 1_{\mathbb{C}^n \otimes \mathbb{C}^K} - \frac{1}{\ell} \sum_{i=1}^{m} b_i b_i^\dagger$. Let $\mathcal{M}$ denote the POVM on $\mathbb{C}^n$ got by tensoring a zero ancilla
over $\mathbb{C}^K$ to states in $\mathbb{C}^n$ and then performing the POVM $\mathcal{M}'$ in $\mathbb{C}^n \otimes \mathbb{C}^K$ consisting of the elements
$\frac{b_i b_i^\dagger}{\ell}$ for $1 \le i \le m$, and the element $\nu$. Note that $\mathcal{M}$ can be implemented as an orthonormal
measurement in $\mathbb{C}^n \otimes \mathbb{C}^{2K}$. Then with probability at least $1 - \exp(-\Omega(m))$ over the choice of $B$,
$\|\mathcal{M}(\sigma_1) - \mathcal{M}(\sigma_2)\|_1 \ge \Omega\left(\frac{f}{\log m}\right)$.

Proof. In order to prove the first part of the theorem, let $K$ be at least as large as the universal constant
in the first part of Lemma 4. Thus, we start out with two density matrices $\hat{\sigma}_1 := \sigma_1 \otimes |0\rangle\langle 0|$, $\hat{\sigma}_2 :=
\sigma_2 \otimes |0\rangle\langle 0|$ in $\mathbb{C}^n \otimes \mathbb{C}^m$. Trivially, $\mathrm{rank}(\hat{\sigma}_1) + \mathrm{rank}(\hat{\sigma}_2) = \mathrm{rank}(\sigma_1) + \mathrm{rank}(\sigma_2) \le 2n \le \sqrt{mn}/K$. Also,
$\|\hat{\sigma}_1 - \hat{\sigma}_2\|_F = \|\sigma_1 - \sigma_2\|_F$. By the first part of Lemma 4, with probability at least $1 - \exp(-\Omega(\sqrt{nm})) -
\frac{\sqrt{nm}}{K} \cdot \exp(-\Omega(nmf^2)) \ge 1 - \exp(-\Omega(Kn))$ over the choice of a random orthonormal basis $B$ of $\mathbb{C}^n \otimes
\mathbb{C}^m$, $\|\mathcal{M}(\sigma_1) - \mathcal{M}(\sigma_2)\|_1 = \|B(\hat{\sigma}_1) - B(\hat{\sigma}_2)\|_1 \ge \Omega(f)$. This completes the proof of the first part of the
theorem.

A very similar strategy allows us to prove the second part of the theorem using the second part of
Lemma 4. □



Remark: The point to note in the second part of the theorem is that the construction of the random POVM
$\mathcal{M}$ does not require a priori knowledge of $\|\sigma_1 - \sigma_2\|_F$. This will be useful in the application to the HSP, in
the proof of Theorem 2.

Finally, we present an example of a pair of density matrices $\sigma_1, \sigma_2$ where with high probability a random
POVM cannot achieve a total variation distance much larger than $\sqrt{\|\sigma_1 - \sigma_2\|_F}$, unless the dimension
of the ancilla used by the POVM is exponentially larger than $\mathrm{rank}(\sigma_1) + \mathrm{rank}(\sigma_2)$. This is essentially
because a sum of independent random variables cannot deviate from its mean by much more than its standard
deviation.

Proposition 3. Let $\sigma_1, \sigma_2$ be completely mixed states supported on two orthogonal $r$-dimensional subspaces
of $\mathbb{C}^n$. Note that $\|\sigma_1 - \sigma_2\|_F = \sqrt{2/r}$ and $\|\sigma_1 - \sigma_2\|_{tr} = 2$. Let $B$ be a random orthonormal basis in $\mathbb{C}^n$.
Then, with probability at least $1 - n\exp(-\sqrt{r})$ over the choice of $B$, $\|B(\sigma_1) - B(\sigma_2)\|_1 \le O(r^{-1/4})$.

Proof. Let $B = \{|b_1\rangle, \ldots, |b_n\rangle\}$. Let $W_1, W_2$ denote the supports of $\sigma_1, \sigma_2$ respectively. Then, $\sigma_i = \frac{1}{r}\Pi_{W_i}$,
where $\Pi_{W_i}$ is the orthogonal projection onto $W_i$. Since each $|b_t\rangle$ is a random unit vector in $\mathbb{C}^n$, putting
$\epsilon = Cr^{-1/4}$, $C$ a universal constant whose value will become clear later, in the second part of Lemma 1 we get
$\frac{1 - \epsilon}{n} \le \langle b_t|\sigma_i|b_t\rangle \le \frac{1 + \epsilon}{n}$ for $i = 1, 2$ and all
$1 \le t \le n$, with probability at least $1 - n\exp(-\sqrt{r})$ over the choice of $B$. Thus,

$$\|B(\sigma_1) - B(\sigma_2)\|_1 = \sum_{t=1}^{n} \left|\langle b_t|\sigma_1|b_t\rangle - \langle b_t|\sigma_2|b_t\rangle\right| \le \sum_{t=1}^{n} \frac{2\epsilon}{n} \le 2\epsilon.$$

This completes the proof of the proposition. □

Now, if we think of $\mathbb{C}^n$ as $\mathbb{C}^{2r} \otimes \mathbb{C}^m$, where $m := \frac{n}{2r}$, we see that a random POVM in $\mathbb{C}^{2r}$ cannot
distinguish between $\sigma_1, \sigma_2$ by much more than $\sqrt{\|\sigma_1 - \sigma_2\|_F}$, unless $n$ is exponentially large compared to
$\mathrm{rank}(\sigma_1) + \mathrm{rank}(\sigma_2)$.
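The two distances quoted in Proposition 3 are easy to verify directly. The sketch below (our own illustration; the states are placed on coordinate subspaces so that both are diagonal) computes them for concrete $r$ and $n$ and shows the gap between the constant trace distance and the shrinking Frobenius distance.

```python
import math

def mixed_pair(r, n):
    """Diagonals of completely mixed states on two orthogonal r-dim coordinate subspaces of C^n."""
    assert 2 * r <= n
    s1 = [1.0 / r] * r + [0.0] * (n - r)
    s2 = [0.0] * r + [1.0 / r] * r + [0.0] * (n - 2 * r)
    return s1, s2

def frobenius_distance(s1, s2):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(s1, s2)))

def trace_distance(s1, s2):
    """Trace norm of the difference; for diagonal states this is an l1 distance."""
    return sum(abs(a - b) for a, b in zip(s1, s2))
```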



4 Random measurement bases and the HSP 

In this section, we study the implications of Theorem 1 for the hidden subgroup problem.

Theorem 1 is in most cases not immediately useful in obtaining single register algorithms for the HSP.
This is because for two candidate hidden subgroups $H_1, H_2$, $\|\sigma_{H_1} - \sigma_{H_2}\|_F \le \|\sigma_{H_1}\|_F + \|\sigma_{H_2}\|_F =
\sqrt{\frac{|H_1|}{|G|}} + \sqrt{\frac{|H_2|}{|G|}}$. Thus, even though $\|\sigma_{H_1} - \sigma_{H_2}\|_{tr} \ge 1$ by Proposition 2, $\|\sigma_{H_1} - \sigma_{H_2}\|_F$ can be exponentially
small if $|H_1|, |H_2|$ are exponentially small compared to $|G|$. In most examples of interest this is indeed
the case. Fortunately, we can make good use of the fact that the coset states for different subgroups of $G$
are simultaneously block diagonal in the Fourier basis of $G$. Hence, we investigate the power of random
Fourier sampling in distinguishing between coset states. The advantage of this is that after doing the quantum
Fourier transform and measuring an irrep name and a row index, we may be left with a reduced state
on the space of column indices with polynomially bounded rank. If this happens, the average Frobenius
distance between the blocks of $\sigma_{H_1}$ and $\sigma_{H_2}$ will be polynomially large even though $\|\sigma_{H_1} - \sigma_{H_2}\|_F$ may be
exponentially small. In fact, for several cases of the HSP studied in the literature, the rank of the reduced
state is either 0 or 1, i.e., the hidden subgroup forms a Gel'fand pair with the ambient group.

To make the above reasoning precise, we define a new distance metric between two coset states $\sigma_{H_1}$,
$\sigma_{H_2}$. Below, we use the notation of Section 2.3.

Definition 1 ($r(H_1, H_2)$). Let $G$ be a group and $H_1, H_2 \le G$. Define

$$r(H_1, H_2) := w(H_1, H_2) + \frac{1}{|G| \log |G|} \cdot \sum_{\rho \in \hat{G}} d_\rho \left\| |H_1| \rho(H_1) - |H_2| \rho(H_2) \right\|_F.$$
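For intuition, Definition 1 can be evaluated by hand in the abelian case. The sketch below assumes $G = \mathbb{Z}_n$ with subgroups $H = \langle d \rangle$, so that every irrep is a character $k$ with $d_\rho = 1$ and $\rho(H)$ is the $1 \times 1$ matrix $[r_k(H)]$; the base-2 logarithm is also an assumption of this illustration.

```python
import math

def rank_on_subgroup(n, d, k):
    """r_k(H) for H = <d> in Z_n: 1 if the character k is trivial on H, else 0."""
    return 1 if (k * d) % n == 0 else 0

def r_distance(n, d1, d2):
    """r(H_1, H_2) of Definition 1 for G = Z_n, H_1 = <d1>, H_2 = <d2>."""
    h1, h2 = n // d1, n // d2
    # | |H_1| rho(H_1) - |H_2| rho(H_2) | for each character k (Frobenius norm of a 1x1 block).
    gaps = [abs(h1 * rank_on_subgroup(n, d1, k) - h2 * rank_on_subgroup(n, d2, k))
            for k in range(n)]
    w = sum(g / n for g in gaps)                      # weak Fourier sampling term
    frobenius_term = sum(gaps) / (n * math.log2(n))   # block Frobenius term
    return w + frobenius_term
```

In this abelian setting the second term is just $w / \log |G|$, which reflects the fact that the column-space blocks carry no information beyond the irrep name.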

The importance of $r(H_1, H_2)$ follows from the following theorem.

Theorem 2. Let $G$ be a group and $H_1, H_2 \le G$. Let $\mathcal{M}$ denote the POVM corresponding to the following
random Fourier sampling procedure: apply $\mathrm{QFT}_G$ to the given coset state, measure the name of an irrep
$\rho \in \hat{G}$ and a row index $i$, and then apply a random POVM $\mathcal{M}_\rho$ on the resulting reduced state on the space of

column indices, where Ai p is defined as in the second part of Theorem[I\with K p := glo | ^ , where C 

is a sufficiently large universal constant. Then with probability at least $1 - \exp(-\log^2 |G|)$ over the choice
of $\mathcal{M}$, $\|\mathcal{M}(\sigma_{H_1}) - \mathcal{M}(\sigma_{H_2})\|_1 \ge \Omega(r(H_1, H_2))$.

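Proof. Let $\sigma_1, \sigma_2$ be two quantum states and $p_1, p_2 \ge 0$. Suppose $p_1 \ge p_2$. Then,

$$\|p_1 \sigma_1 - p_2 \sigma_2\|_F \le \|p_1(\sigma_1 - \sigma_2)\|_F + \|(p_1 - p_2)\sigma_2\|_F \le p_1 \|\sigma_1 - \sigma_2\|_F + |p_1 - p_2|.$$

Now,

$$\|p_1 \mathcal{M}(\sigma_1) - p_2 \mathcal{M}(\sigma_2)\|_1 = \|p_1(\mathcal{M}(\sigma_1) - \mathcal{M}(\sigma_2)) + (p_1 - p_2)\mathcal{M}(\sigma_2)\|_1 \ge \frac{p_1}{2} \|\mathcal{M}(\sigma_1) - \mathcal{M}(\sigma_2)\|_1.$$

The inequality above follows by considering those outcomes of $\mathcal{M}$ that have at least as much probability
for $\sigma_1$ as for $\sigma_2$, and the fact that $(p_1 - p_2)\mathcal{M}(\sigma_2)$ is a vector with non-negative entries. Also,
$\|p_1 \mathcal{M}(\sigma_1) - p_2 \mathcal{M}(\sigma_2)\|_1 \ge |p_1 - p_2|$. Now suppose $\|\mathcal{M}(\sigma_1) - \mathcal{M}(\sigma_2)\|_1 \ge \frac{\|\sigma_1 - \sigma_2\|_F}{L}$, where $L \ge 1$.
Then,

$$\|p_1 \mathcal{M}(\sigma_1) - p_2 \mathcal{M}(\sigma_2)\|_1 \ge \frac{|p_1 - p_2|}{2} + \frac{p_1}{4} \|\mathcal{M}(\sigma_1) - \mathcal{M}(\sigma_2)\|_1 \ge \frac{|p_1 - p_2|}{4L} + \frac{p_1}{4L} \|\sigma_1 - \sigma_2\|_F \ge \frac{\|p_1 \sigma_1 - p_2 \sigma_2\|_F}{4L}.$$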
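The first inequality of the proof, $\|p_1\sigma_1 - p_2\sigma_2\|_F \le p_1\|\sigma_1 - \sigma_2\|_F + |p_1 - p_2|$, can be spot-checked numerically. The sketch below is only a sanity check on randomly generated diagonal states (an assumption made to keep the example dependency-free); it is not a substitute for the argument above.

```python
import math
import random

def random_diagonal_state(n, rng):
    """Diagonal of a random diagonal density matrix (non-negative entries summing to 1)."""
    weights = [rng.random() for _ in range(n)]
    total = sum(weights)
    return [w / total for w in weights]

def frobenius(v):
    return math.sqrt(sum(x * x for x in v))

def scaled_bound_holds(p1, p2, s1, s2):
    """Check ||p1*s1 - p2*s2||_F <= p1*||s1 - s2||_F + |p1 - p2| for diagonal states."""
    lhs = frobenius([p1 * a - p2 * b for a, b in zip(s1, s2)])
    rhs = p1 * frobenius([a - b for a, b in zip(s1, s2)]) + abs(p1 - p2)
    return lhs <= rhs + 1e-12

rng = random.Random(3)
all_ok = all(
    scaled_bound_holds(rng.random(), rng.random(),
                       random_diagonal_state(8, rng), random_diagonal_state(8, rng))
    for _ in range(100)
)
```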

Now suppose we apply QFT G and measure an irrep name p and a row index i. We apply the above 
reasoning to the random POVM M p with L = log Using the second part of Theorem ^ we get that 
with probability at least 1 - exp(-log 2 \G\) over the choice of M p , \\M p (p(Hi)) - M p (p(H 2 ))\\i > 
Q ^ MIL·isíIi^^^MíhÉE. j . Hence for the random Fourier sampling POVM Ai, with probability at least 
1 — exp(— log 2 |G|) over the choice of Ai, 

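$$\|\mathcal{M}(\sigma_{H_1}) - \mathcal{M}(\sigma_{H_2})\|_1 \ge \Omega\left(\frac{1}{|G| \log |G|} \cdot \sum_{\rho \in \hat{G}} d_\rho \left\| |H_1| \rho(H_1) - |H_2| \rho(H_2) \right\|_F\right).$$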

The theorem now follows because random Fourier sampling always does at least as well as weak Fourier 
sampling. □ 

The following corollary is now easy to prove.

Corollary 1. Let $G$ be a group. Suppose for every irrep $\rho \in \hat{G}$ and subgroup $H \le G$, $\mathrm{rank}(\rho(H)) \le
(\log |G|)^{O(1)}$. Then the random Fourier method of Theorem 2 gives rise to a single register algorithm
identifying with probability at least $3/4$ the hidden subgroup $H$ from $(\log |G|)^{O(1)}$ copies of $\sigma_H$.

Proof. Consider two distinct subgroups $H_1, H_2 \le G$. Since coset states are block diagonal in the Fourier
basis of $G$, using Theorem 2, Proposition 2 and Fact 4 we get

$$\begin{aligned}
1 \le \|\sigma_{H_1} - \sigma_{H_2}\|_{tr}
&= \sum_{\rho \in \hat{G}} \frac{d_\rho}{|G|} \left\| |H_1| \rho(H_1) - |H_2| \rho(H_2) \right\|_{tr} \\
&\le \sum_{\rho \in \hat{G}} \frac{d_\rho}{|G|} \left\| |H_1| \rho(H_1) - |H_2| \rho(H_2) \right\|_F \cdot \sqrt{\mathrm{rank}(\rho(H_1)) + \mathrm{rank}(\rho(H_2))} \\
&\le (\log |G|)^{O(1)} \cdot \frac{1}{|G|} \sum_{\rho \in \hat{G}} d_\rho \left\| |H_1| \rho(H_1) - |H_2| \rho(H_2) \right\|_F \\
&\le (\log |G|)^{O(1)} \cdot r(H_1, H_2).
\end{aligned}$$

Let $\mathcal{M}$ denote the random Fourier sampling POVM of Theorem 2. Then with probability at least $1 -
\exp(-\log^2 |G|)$ over the choice of $\mathcal{M}$, $\|\mathcal{M}(\sigma_{H_1}) - \mathcal{M}(\sigma_{H_2})\|_1 \ge \Omega(r(H_1, H_2)) \ge (\log |G|)^{-O(1)}$. Since
a group $G$ can have at most $2^{\log^2 |G|}$ subgroups, by the union bound on probabilities, with probability at least
$1 - \exp(-\Omega(\log^2 |G|))$ over the choice of $\mathcal{M}$, $\|\mathcal{M}(\sigma_{H_1}) - \mathcal{M}(\sigma_{H_2})\|_1 \ge (\log |G|)^{-O(1)}$ for all subgroups
$H_1, H_2 \le G$. The corollary now follows from Fact 5. □

Finally, we remark that in many important examples of the HSP where most of the probability lies on
high dimensional irreps and the blocks corresponding to these irreps have low rank, one can save a factor of
$\log |G|$ in the denominator of the definition of $r(H_1, H_2)$ and prove Theorem 2 with this improved definition
of $r(H_1, H_2)$. This improvement follows by using the first part of Lemma 4 instead of the second part of
Theorem 1 in the proof of Theorem 2. Such a saving can be done, for example, for suitable subgroups of
the affine group, Heisenberg group and groups $\mathbb{Z}_p^r \rtimes \mathbb{Z}_p$, $p$ prime, $r \ge 2$.

5 The general state identification problem 

In this section, we study the implications of Theorem 1 for the state identification problem for a general
ensemble of quantum states. To the best of our knowledge, this problem does not seem to have been studied
before. The following theorem with $r = n$ gives an upper bound on the number of copies required to identify
a given state information-theoretically with high probability for any ensemble.

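Theorem 3. Let $\mathcal{E} = \{\sigma_1, \ldots, \sigma_k\}$ be an a priori known ensemble of quantum states in $\mathbb{C}^n$. Suppose the
minimum trace distance between a pair of states from $\mathcal{E}$ is at least $t$. Let $r$ denote the maximum rank of a
state in $\mathcal{E}$. Then, there is a POVM $\mathcal{M}$ in $\mathbb{C}^n$ such that $\mathcal{M}^{\otimes \ell}$ acting on $\sigma_i^{\otimes \ell}$, $\sigma_i \in \mathcal{E}$, gives enough classical information
to identify $i$ with probability at least $3/4$, where $\ell = O\left(\frac{r \log k}{t^2}\right)$.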



Proof. Define f := -X=. Let Ai be the random POVM guaranteed by the second part of Theorem ffl with 

m := 16nK ^ og m . Fix any pair of states <7j, o~j, i ^ j from £. Then with probability at least 1 — 
exp(— Q(8n log m)) > 1 ±y over the choice of Ai, 

\\M(a z ) - MMh > ~ ajh) > íï f } a \ a ^ ) >ü(- 

\yjTimk{ai - <jj) ) VV 

By the union bound on probabilities, there is a POVM $\mathcal{M}$ on $\mathbb{C}^n$ such that the above inequality holds for
every pair of states from $\mathcal{E}$. By Fact 5, applying $\mathcal{M}^{\otimes \ell}$ on $\sigma_i^{\otimes \ell}$, where $\ell = O\left(\frac{r \log k}{t^2}\right)$, gives enough classical
information to identify $i$ with probability at least $3/4$. □